Semi-supervised recognition of sarcastic sentences in Twitter and Amazon

Dmitry Davidov*, Oren Tsur, Ari Rappoport

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

430 Scopus citations

Abstract

Sarcasm is a form of speech act in which the speakers convey their message in an implicit way. The inherently ambiguous nature of sarcasm sometimes makes it hard even for humans to decide whether an utterance is sarcastic or not. Recognition of sarcasm can benefit many sentiment analysis NLP applications, such as review summarization, dialogue systems and review ranking systems. In this paper we experiment with semi-supervised sarcasm identification on two very different data sets: a collection of 5.9 million tweets collected from Twitter, and a collection of 66000 product reviews from Amazon. Using the Mechanical Turk we created a gold standard sample in which each sentence was tagged by 3 annotators, obtaining F-scores of 0.78 on the product reviews dataset and 0.83 on the Twitter dataset. We discuss the differences between the datasets and how the algorithm uses them (e.g., for the Amazon dataset the algorithm makes use of structured information). We also discuss the utility of Twitter #sarcasm hashtags for the task.

Original language: English
Title of host publication: CoNLL 2010 - Fourteenth Conference on Computational Natural Language Learning, Proceedings of the Conference
Pages: 107-116
Number of pages: 10
State: Published - 2010
Event: 14th Conference on Computational Natural Language Learning, CoNLL 2010 - Uppsala, Sweden
Duration: 15 Jul 2010 to 16 Jul 2010

Publication series

Name: CoNLL 2010 - Fourteenth Conference on Computational Natural Language Learning, Proceedings of the Conference

Conference

Conference: 14th Conference on Computational Natural Language Learning, CoNLL 2010
Country/Territory: Sweden
City: Uppsala
Period: 15/07/10 to 16/07/10
