TY - JOUR
T1 - Evaluating empathy in GPT-4-generated vs. physician-written emergency department discharge letters
AU - Ben-Haim, Gal
AU - Livne, Adva
AU - Manor, Uri
AU - Hochstein, David
AU - Saban, Mor
AU - Blaier, Orly
AU - Iram, Yael Abramov
AU - Balzam, Moran Gigi
AU - Lutenberg, Ariel
AU - Eyade, Rowand
AU - Qassem, Roula
AU - Trabelsi, Dan
AU - Dahari, Yarden
AU - Eisenmann, Ben Zion
AU - Shechtman, Yelena
AU - Nadkarni, Girish N.
AU - Glicksberg, Benjamin S.
AU - Zimlichman, Eyal
AU - Perry, Anat
AU - Klang, Eyal
N1 - Publisher Copyright: © The Author(s) 2025. This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
PY - 2025/11/1
Y1 - 2025/11/1
AB - Background and Aims: Empathy improves clinical outcomes, patient satisfaction, and adherence to treatment. Few studies have explored the real-world use of large language models in conveying empathy. We compared the empathy in emergency department (ED) discharge letters written by GPT-4 and physicians. Methods: We conducted a retrospective, blinded, comparative study in a tertiary ED. All patients discharged during one 8-h shift were included. For each patient, we compared the original ED discharge letter to a GPT-4-generated letter. GPT-4 generated the letters using ED notes, excluding the original discharge letter. Seventeen evaluators (seven physicians, five nurses, five patients) compared the letters side by side. They were blinded to the source. Evaluators first chose between the AI and human letters. Then they rated each letter for empathy, overall quality, clarity of summary, and clarity of recommendations using a 5-point Likert scale. Results: Evaluators preferred GPT-4 over physician letters in 83.7% of comparisons (1009 vs. 197; p < 0.001). GPT-4 letters received higher scores for empathy (median 4.0 vs. 3.0; p < 0.001), overall quality, and clarity of summary across all evaluator groups. Among patients, no significant difference was found in the clarity of recommendations (p = 0.771). Qualitative analysis showed that GPT-4's empathetic expressions, though sometimes generic, were perceived as effective. Conclusion: GPT-4 shows strong potential in generating empathetic ED discharge letters. These letters are preferred by healthcare professionals and patients. GPT-4 offers a promising tool to reduce the workload of ED physicians. Further research is necessary to explore patient perceptions and best practices for integrating AI with physicians in clinical practice.
KW - Empathy
KW - GPT-4
KW - clarity
KW - discharge letters
KW - emergency department
UR - https://www.scopus.com/pages/publications/105022137870
U2 - 10.1177/20552076251389992
DO - 10.1177/20552076251389992
M3 - Article
C2 - 41246211
AN - SCOPUS:105022137870
SN - 2055-2076
VL - 11
JO - Digital Health
JF - Digital Health
ER -