Beyond sentiment: an algorithmic strategy for identifying evaluations within large text corpora

Maximilian Overbeck*, Christian Baden, Tali Aharoni, Eedan Amit-Danhi, Keren Tenenboim-Weinblatt

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


In this paper, we propose a new strategy for classifying evaluations in large text corpora, using supervised machine learning (SML). Departing from a conceptual and methodological critique of the use of sentiment measures to recognize object-specific evaluations, we argue that a key challenge consists in determining whether a semantic relationship exists between evaluative expressions and evaluated objects. Regarding sentiment terms as merely potentially evaluative expressions, we thus use an SML classifier to decide whether recognized terms have an evaluative function in relation to the evaluated object. We train and test our classifier on a corpus of 10,004 segments of election coverage from 16 major U.S. news outlets and Tweets by 10 prominent U.S. politicians and journalists. Specifically, we focus on evaluations of political predictions about the outcomes and implications of the 2016 and 2020 U.S. presidential elections. We show that our classifier consistently outperforms both off-the-shelf sentiment tools and a pre-trained transformer-based sentiment classifier. Critically, our classifier correctly discards numerous non-evaluative uses of common sentiment terms, whose inclusion in conventional analyses generates large numbers of false positives. We discuss contributions of our approach to the measurement of object-specific evaluations and highlight challenges for future research.
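The two-step logic described in the abstract (treat sentiment terms as candidates, then let a supervised classifier decide whether each candidate is evaluative) can be illustrated with a minimal sketch. The abstract does not specify the authors' features, lexicon, or model, so everything below is an assumption made for illustration: the tiny lexicon `SENTIMENT_TERMS`, the hand-labelled toy segments in `TRAIN`, the context window of three tokens, and the swapped-in naive Bayes classifier `ContextNaiveBayes` are all hypothetical and are not the paper's implementation.

```python
import math
from collections import Counter

# Hypothetical mini-lexicon of *potentially* evaluative terms (not from the paper).
SENTIMENT_TERMS = {"wrong", "accurate", "flawed", "win", "lose"}

def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    tokens = (t.strip(".,!?;:\"'") for t in text.lower().split())
    return [t for t in tokens if t]

def candidate_contexts(segment, width=3):
    """Step 1: find lexicon hits and return (term, surrounding tokens) pairs."""
    tokens = tokenize(segment)
    return [
        (tok, tokens[max(0, i - width):i] + tokens[i + 1:i + 1 + width])
        for i, tok in enumerate(tokens)
        if tok in SENTIMENT_TERMS
    ]

class ContextNaiveBayes:
    """Step 2: toy multinomial naive Bayes over context tokens (add-one smoothing)."""

    def fit(self, contexts, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for ctx, y in zip(contexts, labels):
            self.counts[y].update(ctx)
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict(self, context):
        total = sum(self.prior.values())
        best, best_score = None, -math.inf
        for c in self.classes:
            n = sum(self.counts[c].values())
            score = math.log(self.prior[c] / total)
            for w in context:
                score += math.log((self.counts[c][w] + 1) / (n + len(self.vocab)))
            if score > best_score:
                best, best_score = c, score
        return best

# Invented toy data: 1 = the sentiment term evaluates a prediction, 0 = it does not.
TRAIN = [
    ("The poll forecast was wrong about Ohio", 1),
    ("Her prediction proved accurate in the end", 1),
    ("That forecast model is flawed", 1),
    ("Analysts predict Biden will win Georgia", 0),
    ("The forecast says Trump could lose Arizona", 0),
    ("Candidates who lose often win later", 0),
]

contexts, labels = [], []
for segment, label in TRAIN:
    for _term, ctx in candidate_contexts(segment):
        contexts.append(ctx)
        labels.append(label)

model = ContextNaiveBayes().fit(contexts, labels)

def evaluative_terms(segment):
    """Keep only the lexicon hits that the classifier judges to be evaluative."""
    return [t for t, ctx in candidate_contexts(segment) if model.predict(ctx) == 1]

print(evaluative_terms("The forecast was flawed"))
print(evaluative_terms("Pundits say Biden will win Georgia"))
```

A plain lexicon count would flag both example segments, since each contains a sentiment term. In this sketch the second step is what separates a term that evaluates a prediction ("flawed") from one that merely forms part of the prediction's content ("win").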

Original language: American English
Journal: Communication Methods and Measures
State: Accepted/In press - 2023

Bibliographical note

Publisher Copyright:
© 2023 The Author(s). Published with license by Taylor & Francis Group, LLC.

