Comparative Analysis of ChatGPT and Human Expertise in Diagnosing Primary Liver Carcinoma: A Focus on Gross Morphology
DOI:
https://doi.org/10.33192/smj.v77i2.271596
Keywords:
Artificial intelligence, ChatGPT, GPT-4, Liver cancer, Hepatocellular carcinoma, Cholangiocarcinoma
Abstract
Objective: This study aims to compare the diagnostic accuracy of customized ChatGPT and human experts in identifying primary liver carcinoma using gross morphology.
Materials and Methods: Gross morphology images of hepatocellular carcinoma (HCC) and cholangiocarcinoma (CCA) cases were assessed. These images were analyzed by two versions of customized ChatGPT (one with and one without a scoring system), pathology residents, and pathologist assistants. The diagnostic accuracy and consistency of each participant group were evaluated.
Results: A total of 128 liver carcinoma images (62 HCC, 66 CCA) were analyzed, with the participation of 13 pathology residents (median experience of 1.5 years) and three pathologist assistants (median experience of 5 years). When augmented with a scoring system, ChatGPT’s performance aligned closely with that of first- and second-year pathology residents and was significantly inferior to that of third-year pathology residents and pathologist assistants (p-values < 0.01). In contrast, the diagnostic accuracy of ChatGPT without the scoring system was significantly lower than that of all human participants (p-values < 0.01). Kappa statistics indicated slight to fair diagnostic consistency for both customized versions of ChatGPT and for the pathology residents, whereas interobserver agreement among the pathologist assistants was moderate.
Conclusion: The study highlights the potential of ChatGPT for augmenting diagnostic processes in pathology. However, it also emphasizes the current limitations of this AI tool compared to human expertise, particularly among experienced participants. This suggests the importance of integrating AI with human judgment in diagnostic pathology.
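The agreement labels used in the Results ("slight", "fair", "moderate") match the commonly used Landis and Koch scale for kappa values. The abstract does not state which kappa variant was computed or give any rater-level data, so the following is only a minimal sketch: it assumes Cohen's two-rater kappa, and the label lists in the usage section are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch agreement category."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical diagnoses from two raters on eight images (not study data).
rater_a = ["HCC", "HCC", "CCA", "CCA", "HCC", "CCA", "HCC", "CCA"]
rater_b = ["HCC", "CCA", "CCA", "CCA", "HCC", "HCC", "HCC", "CCA"]

kappa = cohen_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f} ({landis_koch_label(kappa)})")
```

In this made-up example the raters agree on 6 of 8 images (observed agreement 0.75) while chance agreement is 0.50, giving a kappa of 0.50, which falls in the "moderate" band. With more than two raters, as in the resident group, a multi-rater statistic such as Fleiss' kappa or averaged pairwise kappa would be the more usual choice.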
License
Copyright (c) 2024 Siriraj Medical Journal

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.