The “Podcast” ECoG dataset for modeling neural activity during natural language comprehension

Zaid Zada*, Samuel A. Nastase, Bobbi Aubrey, Itamar Jalon, Sebastian Michelmann, Haocheng Wang, Liat Hasenfratz, Werner Doyle, Daniel Friedman, Patricia Dugan, Lucia Melloni, Sasha Devore, Adeen Flinker, Orrin Devinsky, Ariel Goldstein, Uri Hasson

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Naturalistic electrocorticography (ECoG) data are a rare but essential resource for studying the brain’s linguistic capabilities. ECoG offers high temporal resolution suitable for investigating processes at multiple temporal timescales and frequency bands. It also provides broad spatial coverage, often along critical language areas. Here, we share a dataset of nine ECoG participants with 1,330 electrodes listening to a 30-minute audio podcast. The richness of this naturalistic stimulus can be used for various research questions, from auditory perception to narrative integration. In addition to the neural data, we extracted linguistic features of the stimulus ranging from phonetic information to large language model word embeddings. We use these linguistic features in encoding models that relate stimulus properties to neural activity. Finally, we provide detailed tutorials for preprocessing raw data, extracting stimulus features, and running encoding analyses that can serve as a pedagogical resource or a springboard for new research.

Original language: English
Article number: 1135
Journal: Scientific Data
Volume: 12
Issue number: 1
State: Published - Dec 2025

Bibliographical note

Publisher Copyright:
© The Author(s) 2025.
