Three Gaps in Computational Text Analysis Methods for Social Sciences: A Research Agenda

Christian Baden*, Christian Pipal, Martijn Schoonvelde, Mariken A.C.G. van der Velden

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

73 Scopus citations


We identify three gaps that limit the utility and obstruct the progress of computational text analysis methods (CTAM) for social science research. First, we contend that CTAM development has prioritized technological over validity concerns, giving limited attention to the operationalization of social scientific measurements. Second, we identify a mismatch between CTAMs’ focus on extracting specific contents and document-level patterns, and social science researchers’ need for measuring multiple, often complex contents in the text. Third, we argue that the dominance of English language tools depresses comparative research and inclusivity toward scholarly communities examining languages other than English. We substantiate our claims by drawing upon a broad review of methodological work in the computational social sciences, as well as an inventory of leading research publications using quantitative textual analysis. Subsequently, we discuss implications of these three gaps for social scientists’ uneven uptake of CTAM, as well as the field of computational social science text research as a whole. Finally, we propose a research agenda intended to bridge the identified gaps and improve the validity, utility, and inclusiveness of CTAM.

Original language: American English
Pages (from-to): 1-18
Number of pages: 18
Journal: Communication Methods and Measures
Issue number: 1
State: Published - 2022

Bibliographical note

Publisher Copyright:
© 2021 The Author(s). Published with license by Taylor & Francis Group, LLC.

