Leveraging Prompt-Learning for Structured Information Extraction from Crohn’s Disease Radiology Reports in a Low-Resource Language

Liam Hazan, Gili Focht, Naama Gavrielov, Roi Reichart, Talar Hagopian, Mary Louise C. Greer, Ruth Cytter Kuint, Dan Turner, Moti Freiman

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Automatic conversion of free-text radiology reports into structured data using Natural Language Processing (NLP) techniques is crucial for analyzing diseases on a large scale. While effective for tasks in widely spoken languages like English, generative large language models (LLMs) typically underperform in less common languages and can pose risks to patient privacy. Fine-tuning local NLP models is hindered by the skewed nature of real-world medical datasets, in which rare findings are heavily underrepresented. We introduce SMP-BERT, a novel prompt-learning method that leverages the structured nature of reports to overcome these challenges. In experiments on a large collection of Crohn’s disease radiology reports in Hebrew (over 8,000 patients and 10,000 reports), SMP-BERT substantially outperformed traditional fine-tuning, notably in detecting infrequent conditions (AUC: 0.99 vs 0.94, F1: 0.84 vs 0.34). SMP-BERT makes more accurate AI diagnostics available for low-resource languages.

Original language: English
Title of host publication: ClinicalNLP 2024 - 6th Workshop on Clinical Natural Language Processing, Proceedings of the Workshop
Editors: Tristan Naumann, Asma Ben Abacha, Steven Bethard, Kirk Roberts, Danielle Bitterman
Publisher: Association for Computational Linguistics (ACL)
Pages: 301-309
Number of pages: 9
ISBN (Electronic): 9798891761094
State: Published - 2024
Externally published: Yes
Event: 6th Workshop on Clinical Natural Language Processing, ClinicalNLP 2024, held at NAACL 2024 - Mexico City, Mexico
Duration: 21 Jun 2024 → …

Publication series

Name: ClinicalNLP 2024 - 6th Workshop on Clinical Natural Language Processing, Proceedings of the Workshop

Conference

Conference: 6th Workshop on Clinical Natural Language Processing, ClinicalNLP 2024, held at NAACL 2024
Country/Territory: Mexico
City: Mexico City
Period: 21/06/24 → …

Bibliographical note

Publisher Copyright:
© 2024 Association for Computational Linguistics.
