Background: The readability of online bariatric surgery patient education materials (PEMs) often surpasses the recommended 6th grade level. Large language models (LLMs), like ChatGPT and Bard, have the potential to revolutionize PEM delivery. We aimed to evaluate the readability of PEMs produced by U.S. medical institutions compared to LLMs, as well as the ability of LLMs to simplify their responses. Methods: Responses to frequently asked questions (FAQs) related to bariatric surgery were gathered from top-ranked health institutions. FAQ responses were also generated by GPT-3.5, GPT-4, and Bard. The LLMs were then prompted to improve the readability of their initial responses. The readability of institutional responses, initial LLM responses, and simplified LLM responses was graded using validated readability formulas. The accuracy and comprehensiveness of initial and simplified LLM responses were also compared. Results: Responses to 66 FAQs were included. All institutional and initial LLM responses had poor readability, with average reading levels ranging from 9th grade to college graduate. Simplified LLM responses had significantly improved readability, with reading levels ranging from 6th grade to college freshman. Among the simplified responses, GPT-4's demonstrated the highest readability, with reading levels ranging from 6th to 9th grade. Accuracy was similar between initial and simplified responses from all LLMs. Comprehensiveness was similar between initial and simplified responses from GPT-3.5 and GPT-4; however, 34.8% of Bard's simplified responses were graded as less comprehensive than its initial responses. Conclusion: Our study highlights the efficacy of LLMs in enhancing the readability of bariatric surgery PEMs. GPT-4 outperformed the other models, generating simplified PEMs at 6th to 9th grade reading levels. Unlike those of GPT-3.5 and GPT-4, Bard's simplified responses were graded as less comprehensive. We advocate for future studies examining the potential role of LLMs as dynamic and personalized sources of PEMs for diverse patient populations of all literacy levels.
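The abstract does not name the specific validated readability formulas used; as an illustration, the Flesch-Kincaid Grade Level is one widely used choice. A minimal Python sketch, assuming that formula and a rough vowel-group syllable heuristic:

```python
# Minimal sketch of the Flesch-Kincaid Grade Level, one widely used validated
# readability formula (the study does not name its formulas, so this choice is
# illustrative). The syllable counter is a rough vowel-group heuristic.
import re

def count_syllables(word: str) -> int:
    """Approximate syllables by counting contiguous vowel groups."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade_level(text: str) -> float:
    """FK grade = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    n = max(1, len(words))
    return 0.39 * (n / sentences) + 11.8 * (syllables / n) - 15.59

# A short, plainly worded passage scores at or below the 6th grade target.
sample = "Weight loss surgery helps many people. Talk to your doctor about the risks."
print(round(fk_grade_level(sample), 1))
```

Scores above roughly 6.0 under formulas like this one are what the study flags as exceeding the recommended reading level.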
• GPT-4 provided accurate and comprehensive responses to questions related to bariatric surgery
• GPT-4 and GPT-3.5 provided responses of relatively comparable accuracy
• GPT-4 provided more comprehensive responses compared to GPT-3.5
Purpose ChatGPT is a large language model trained on a large dataset covering a broad range of topics, including the medical literature. We aimed to examine its accuracy and reproducibility in answering patient questions regarding bariatric surgery. Materials and methods Questions were gathered from nationally regarded professional societies and health institutions as well as Facebook support groups. Board-certified bariatric surgeons graded the accuracy and reproducibility of responses. The grading scale included the following: (1) comprehensive, (2) correct but inadequate, (3) some correct and some incorrect, and (4) completely incorrect. Reproducibility was determined by asking the model each question twice and examining differences in grading category between the two responses. Results In total, 151 questions related to bariatric surgery were included. The model provided "comprehensive" responses to 131/151 (86.8%) of questions. When examined by category, the model provided "comprehensive" responses to 93.8% of questions related to "efficacy, eligibility and procedure options"; 93.3% related to "preoperative preparation"; 85.3% related to "recovery, risks, and complications"; 88.2% related to "lifestyle changes"; and 66.7% related to "other". The model provided reproducible answers to 137 (90.7%) of the questions. Conclusion The large language model ChatGPT often provided accurate and reproducible responses to common questions related to bariatric surgery. ChatGPT may serve as a helpful adjunct information resource for patients regarding bariatric surgery in addition to standard of care provided by licensed healthcare professionals. We encourage future studies to examine how to leverage this disruptive technology to improve patient outcomes and quality of life.
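The reproducibility tally described in the methods can be sketched compactly. In this hypothetical Python sketch, the four grade labels come from the abstract, but the same-category agreement rule and the paired-data layout are assumptions:

```python
# Hypothetical sketch of the reproducibility check described above: each
# question is asked twice and the two graded categories (comprehensive;
# correct but inadequate; some correct and some incorrect; completely
# incorrect) are compared. The same-category agreement rule is an assumption.

def reproducibility_rate(pairs: list[tuple[str, str]]) -> float:
    """Fraction of questions whose two responses received the same grade."""
    same = sum(1 for first, second in pairs if first == second)
    return same / len(pairs)

# Example with three question pairs: two agree, one does not.
pairs = [
    ("comprehensive", "comprehensive"),
    ("correct but inadequate", "comprehensive"),
    ("comprehensive", "comprehensive"),
]
print(f"{reproducibility_rate(pairs):.1%}")  # 66.7%
```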
PURPOSE: The purpose of this exploratory study was to investigate the types of academic and health-related accommodations provided to adolescents and emerging adults with spina bifida aged 9-20 years. METHODS: Data were extracted from the paper and electronic records of transition-age youth enrolled in the study. Four open-ended items underwent content analysis. RESULTS: The most frequently identified accommodation was enrollment in special education classes, noted in 47.7% of the charts. Other academic accommodations that were most often reported were adaptive physical education (n = 71; 39.9%), tutoring (n = 28; 15.7%), and home schooling (n = 21; 11.8%). Clean intermittent catheterization was the most frequently identified health-related accommodation provided by the school nurse/aide (n = 57; 32%). The largest percentage of requests for additional accommodations were made during the middle school grades (n = 15; 54.8%), followed by high school (n = 10; 32.2%). CONCLUSION: Findings demonstrated tha...