FinLex: An effective use of word embeddings for financial lexicon generation

Bibliographic Details
Main Authors: Sanjiv R. Das, Michele Donini, Muhammad Bilal Zafar, John He, Krishnaram Kenthapadi
Format: article
Language: EN
Published: KeAi Communications Co., Ltd. 2022
Online Access: https://doaj.org/article/a9e7fbaa425b46cd8cd98d2d853264f8
Description
Summary: We present a simple and effective methodology for the generation of lexicons (word lists) that may be used in natural language scoring applications. In particular, in the finance industry, word lists have become ubiquitous for sentiment scoring. These have been derived from dictionaries such as the Harvard Inquirer and require manual curation. Here, we present an automated approach to the curation of lexicons, which makes automatic preparation of any word list immediate. We show that our automated word lists deliver comparable performance to traditional lexicons on machine learning classification tasks. This new approach will enable finance academics and practitioners to create and deploy new word lists in addition to the few traditional ones in a facile manner.
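
Illustration: the summary does not spell out the procedure, so the following is only a minimal sketch of one common way word embeddings can be used to grow a word list, not the authors' method. It assumes a small set of seed words and expands it with every vocabulary word whose embedding has a cosine similarity to some seed at or above a threshold. The function name `expand_lexicon`, the threshold of 0.8, and the three-dimensional toy vectors are all hypothetical and chosen purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_lexicon(seeds, embeddings, threshold=0.8):
    """Return the seed words found in `embeddings`, plus every vocabulary
    word whose vector is within `threshold` cosine similarity of a seed."""
    known_seeds = [s for s in seeds if s in embeddings]
    lexicon = set(known_seeds)
    for word, vec in embeddings.items():
        if any(cosine(vec, embeddings[s]) >= threshold for s in known_seeds):
            lexicon.add(word)
    return sorted(lexicon)


# Toy vectors, invented for this example (assumption: real embeddings
# would come from a trained model and have far more dimensions).
toy_embeddings = {
    "loss": [0.9, 0.1, 0.0],
    "deficit": [0.8, 0.2, 0.1],
    "impairment": [0.85, 0.15, 0.05],
    "profit": [0.0, 0.9, 0.1],
    "gain": [0.1, 0.85, 0.0],
    "quarter": [0.1, 0.1, 0.9],
}

negative_words = expand_lexicon(["loss"], toy_embeddings)
print(negative_words)
```

With these toy vectors, "deficit" and "impairment" lie close to the seed "loss" and join the list, while "profit", "gain", and "quarter" fall below the threshold. In practice the threshold and the seed set would need tuning against the downstream scoring task.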