TY - JOUR
T1 - The “Podcast” ECoG dataset for modeling neural activity during natural language comprehension
AU - Zada, Zaid
AU - Nastase, Samuel A.
AU - Aubrey, Bobbi
AU - Jalon, Itamar
AU - Michelmann, Sebastian
AU - Wang, Haocheng
AU - Hasenfratz, Liat
AU - Doyle, Werner
AU - Friedman, Daniel
AU - Dugan, Patricia
AU - Melloni, Lucia
AU - Devore, Sasha
AU - Flinker, Adeen
AU - Devinsky, Orrin
AU - Goldstein, Ariel
AU - Hasson, Uri
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025/12
Y1 - 2025/12
AB - Naturalistic electrocorticography (ECoG) data are a rare but essential resource for studying the brain’s linguistic capabilities. ECoG offers high temporal resolution suitable for investigating processes at multiple temporal timescales and frequency bands. It also provides broad spatial coverage, often along critical language areas. Here, we share a dataset of nine ECoG participants with 1,330 electrodes listening to a 30-minute audio podcast. The richness of this naturalistic stimulus can be used for various research questions, from auditory perception to narrative integration. In addition to the neural data, we extracted linguistic features of the stimulus ranging from phonetic information to large language model word embeddings. We use these linguistic features in encoding models that relate stimulus properties to neural activity. Finally, we provide detailed tutorials for preprocessing raw data, extracting stimulus features, and running encoding analyses that can serve as a pedagogical resource or a springboard for new research.
UR - https://www.scopus.com/pages/publications/105010068460
U2 - 10.1038/s41597-025-05462-2
DO - 10.1038/s41597-025-05462-2
M3 - Article
C2 - 40610484
AN - SCOPUS:105010068460
SN - 2052-4463
VL - 12
JO - Scientific Data
JF - Scientific Data
IS - 1
M1 - 1135
ER -