Evaluating empathy in GPT-4-generated vs. physician-written emergency department discharge letters

  • Gal Ben-Haim
  • Adva Livne*
  • Uri Manor
  • David Hochstein
  • Mor Saban
  • Orly Blaier
  • Yael Abramov Iram
  • Moran Gigi Balzam
  • Ariel Lutenberg
  • Rowand Eyade
  • Roula Qassem
  • Dan Trabelsi
  • Yarden Dahari
  • Ben Zion Eisenmann
  • Yelena Shechtman
  • Girish N. Nadkarni
  • Benjamin S. Glicksberg
  • Eyal Zimlichman
  • Anat Perry
  • Eyal Klang

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

Abstract

Background and Aims: Empathy improves clinical outcomes, patient satisfaction, and adherence to treatment. Few studies have explored the real-world use of large language models in conveying empathy. We compared the empathy in emergency department (ED) discharge letters written by GPT-4 and by physicians.

Methods: We conducted a retrospective, blinded, comparative study in a tertiary ED. All patients discharged during one 8-h shift were included. For each patient, we compared the original ED discharge letter to a GPT-4-generated letter. GPT-4 generated the letters from the ED notes, excluding the original discharge letter. Seventeen evaluators (seven physicians, five nurses, five patients), blinded to the source, compared the letters side by side. Evaluators first chose between the AI and human letters. They then rated each letter for empathy, overall quality, clarity of summary, and clarity of recommendations on a 5-point Likert scale.

Results: Evaluators preferred GPT-4 letters over physician letters in 83.7% of comparisons (1009 vs. 197; p < 0.001). GPT-4 letters received higher scores for empathy (median 4.0 vs. 3.0; p < 0.001), overall quality, and clarity of summary across all evaluator groups. Among patients, no significant difference was found in the clarity of recommendations (p = 0.771). Qualitative analysis showed that GPT-4's empathetic expressions, though sometimes generic, were perceived as effective.

Conclusion: GPT-4 shows strong potential in generating empathetic ED discharge letters. These letters are preferred by healthcare professionals and patients. GPT-4 offers a promising tool to reduce the workload of ED physicians. Further research is necessary to explore patient perceptions and best practices for integrating AI with physicians in clinical practice.
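
The headline preference figure can be checked directly from the two counts reported in the Results. The following Python sketch is an illustration only, not the authors' analysis: the abstract does not name the statistical test used, so the exact binomial test below, and its treatment of every comparison as independent, are assumptions, as are all variable names.

```python
from math import comb

# Counts reported in the abstract: comparisons won by each letter type.
gpt4_preferred = 1009
physician_preferred = 197
total = gpt4_preferred + physician_preferred

# Share of comparisons in which the GPT-4 letter was preferred.
share = gpt4_preferred / total
print(f"GPT-4 preferred in {share:.1%} of {total} comparisons")

# Assumption (not from the source): treat each comparison as an independent
# trial and compute an exact two-sided binomial p-value against a 50% null.
upper_tail = sum(comb(total, k) for k in range(gpt4_preferred, total + 1))
p_value = min(1.0, 2 * upper_tail / 2**total)
print(f"p < 0.001: {p_value < 0.001}")
```

Because the same evaluators rated many letters, the comparisons are likely correlated, so a simple binomial test of this kind probably overstates the evidence; it serves only to confirm that the reported percentage follows from the counts.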

Original language: English
Journal: Digital Health
Volume: 11
State: Published - 1 Nov 2025

Bibliographical note

Publisher Copyright:
© The Author(s) 2025. This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).

Keywords

  • Empathy
  • GPT-4
  • clarity
  • discharge letters
  • emergency department
