Disambiguating company names in microblog text using clustering for online reputation management

Twitter is used by millions of users to publish brief messages (tweets) with the purpose of sharing experiences and/or opinions about a product or service. There is a clear need for systems that can mine these messages in order to derive information about the collective thinking of twitterers (e.g....

Bibliographic Details
Main authors: Pérez-Tellez, Fernando; Cardiff, John; Rosso, Paolo; Pinto, David
Language: English
Published: Pontificia Universidad Católica de Valparaíso. Instituto de Literatura y Ciencias del Lenguaje 2015
Subjects: Clustering of tweets; opinion analysis; disambiguation; online reputation management
Online access: http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0718-09342015000100003
id oai:scielo:S0718-09342015000100003
record_format dspace
spelling oai:scielo:S0718-09342015000100003; 2015-03-03; Disambiguating company names in microblog text using clustering for online reputation management; Pérez-Tellez, Fernando; Cardiff, John; Rosso, Paolo; Pinto, David; Clustering of tweets; opinion analysis; disambiguation; online reputation management; info:eu-repo/semantics/openAccess; Pontificia Universidad Católica de Valparaíso. Instituto de Literatura y Ciencias del Lenguaje; Revista signos v.48 n.87 2015; 2015-03-01; text/html; http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0718-09342015000100003; en; 10.4067/S0718-09342015000100003
institution Scielo Chile
collection Scielo Chile
language English
topic Clustering of tweets
opinion analysis
disambiguation
online reputation management
description Twitter is used by millions of users to publish brief messages (tweets) with the purpose of sharing experiences and/or opinions about a product or service. There is a clear need for systems that can mine these messages in order to derive information about the collective thinking of twitterers (e.g. for opinion or sentiment analysis). Tweet analysis is a very important task because comments, opinions, suggestions, complaints, etc. can be used for marketing strategies or for determining information on a company’s reputation. For this purpose, it is necessary to automatically establish whether a tweet refers to a company or not when the company name is ambiguous. This task is not a straightforward keyword search process, as there may be multiple contexts in which a name can be used. The aim of this study is to present and compare four different approaches that improve the representation of short texts for better performance of the clustering task that determines whether a given tweet refers to a particular company or not. For this purpose, we have used a variety of enriching methodologies based on term expansion via the semantic similarity hidden behind the lexical structure, in order to improve the representation of tweets and, as a consequence, the performance of the task. We have used two different tweet datasets of company names which contain different levels of ambiguity. The results are promising, although they highlight the difficulty of this task.
author Pérez-Tellez, Fernando
Cardiff, John
Rosso, Paolo
Pinto, David
author_sort Pérez-Tellez,Fernando
title Disambiguating company names in microblog text using clustering for online reputation management
publisher Pontificia Universidad Católica de Valparaíso. Instituto de Literatura y Ciencias del Lenguaje
publishDate 2015
url http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0718-09342015000100003
_version_ 1714201846400155648