Evaluating AI-Generated Emails: A Comparative Efficiency Analysis

Marina Jovic, Salaheddine Mnasri

Abstract


This study investigates the efficiency of large language models (LLMs) in producing routine, negative, and persuasive business emails for educational purposes within the context of Business Writing. Specifically, it compares the outputs of four widely used LLMs (ChatGPT 3.5, Llama 2, Bing Chat, and Bard) when each is presented with identical email scenarios. The generated emails are evaluated using a detailed rubric, allowing for a systematic assessment of the LLMs' performance across the three email types. The results show that the outputs produced from the same prompt vary greatly despite the rather formulaic nature of business emails. For instance, some LLMs struggle to follow the requested structure and maintain a consistent tone, while others have issues with unity and conciseness. The findings have implications for teaching business writing (rubrics, task instructions, in-class implementation), as well as for the integration of AI in professional communication at large.



DOI: https://doi.org/10.5430/wjel.v14n2p502

World Journal of English Language
ISSN 1925-0703 (Print), ISSN 1925-0711 (Online)

Copyright © Sciedu Press
