Clause analysis: Using syntactic information to automatically extract source, subject, and predicate from texts with an application to the 2008-2009 Gaza War

Wouter van Atteveldt*, Tamir Sheafer, Shaul R. Shenhav, Yair Fogel-Dror

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

24 Scopus citations

Abstract

This article presents a new method and open source R package that uses syntactic information to automatically extract source-subject-predicate clauses. This improves on frequency-based text analysis methods by dividing text into predicates with an identified subject and optional source, extracting the statements and actions of (political) actors as mentioned in the text. The content of these predicates can be analyzed using existing frequency-based methods, allowing for the analysis of actions, issue positions and framing by different actors within a single text. We show that a small set of syntactic patterns can extract clauses and identify quotes with good accuracy, significantly outperforming a baseline system based on word order. Taking the 2008-2009 Gaza war as an example, we further show how corpus comparison and semantic network analysis applied to the results of the clause analysis can show differences in citation and framing patterns between U.S. and English-language Chinese coverage of this war.

Original language: English
Pages (from-to): 207-222
Number of pages: 16
Journal: Political Analysis
Volume: 25
Issue number: 2
State: Published - 1 Apr 2017

Bibliographical note

Publisher Copyright:
© The Author(s) 2017.
