More than words: Frequency effects for multi-word phrases

Inbal Arnon*, Neal Snider

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

399 Scopus citations

Abstract

There is mounting evidence that language users are sensitive to distributional information at many grain-sizes. Much of this research has focused on the distributional properties of words, the units they consist of (morphemes, phonemes), and the syntactic structures they appear in (verb-categorization frames, syntactic constructions). In a series of studies we show that comprehenders are also sensitive to the frequencies of compositional four-word phrases (e.g. "don't have to worry"): more frequent phrases are processed faster. The effect is not reducible to the frequency of the individual words or substrings and is observed across the entire frequency range (for low, mid and high frequency phrases). Comprehenders seem to learn and store frequency information about multi-word phrases. These findings call for processing models that can capture and predict phrase-frequency effects and support accounts where linguistic knowledge consists of patterns of varying sizes and levels of abstraction.

Original language: American English
Pages (from-to): 67-82
Number of pages: 16
Journal: Journal of Memory and Language
Volume: 62
Issue number: 1
State: Published - Jan 2010
Externally published: Yes

Bibliographical note

Funding Information:
This work was supported by a Stanford Graduate Fellowship given to both authors, and NSF Award No. IS-0624345. We would like to thank Dan Jurafsky for guidance and support. We thank Florian Jaeger, Dan Jurafsky, Victor Kuperman, Meghan Sumner, and Harry Tily for helpful comments and suggestions, as well as the audience at the 83rd meeting of the Linguistic Society of America.

Keywords

  • Comprehension
  • Frequency
  • Lexicon
  • Ngram
  • Usage-based models
