Evaluating ChatGPT-4.0’s accuracy and potential in idiopathic scoliosis conservative treatment: a preliminary study on clarity, validity, and expert perceptions

Article

Publication date:

2025

Abstract:

Purpose: This study aimed to evaluate the scientific accuracy, content validity, and clarity of ChatGPT-4.0’s responses on the conservative management of idiopathic scoliosis. The research explored whether the model could effectively support patient education in an area where non-surgical treatment information is crucial. Methods: Fourteen frequently asked questions (FAQs) regarding conservative scoliosis treatment were identified using a systematic, multi-step approach that combined web-based inquiry and expert input. Each question was submitted individually to ChatGPT-4.0 on December 6, 2024, using a standardized patient prompt (“I’m a scoliosis patient. Limit your answer to 150 words”). The generated responses were evaluated by a panel of 37 experts from a specialized spinal deformity center via an online survey using a 6-point Likert scale. Content validity was assessed using the Content Validity Ratio (CVR) and Content Validity Index (CVI), and inter-rater reliability was calculated with Fleiss’ kappa. Experts also provided categorical feedback on the reasons for any rating discrepancies. Results: Eleven of the 14 responses met the CVR threshold (≥ 0.38), yielding an overall CVI of 0.68. Three responses, addressing “What is scoliosis?”, “Can exercises or physical therapy cure scoliosis?”, and “What is the best sport for scoliosis?”, showed lower validity (CVR scores: 0.37, 0.37, and −0.58, respectively), primarily due to factual inaccuracies and insufficient detail. Clarity received the highest ratings (median = 6), while comprehensiveness, professionalism, and response length each had a median score of 5. Inter-rater reliability was slight (Fleiss’ kappa = 0.10). Conclusion: ChatGPT-4.0 generally provides clear and accessible information on conservative idiopathic scoliosis management, supporting its potential as a patient education tool. Nonetheless, variability in response accuracy and expert evaluation underscores the need for further refinement and expert supervision before wider clinical application.
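
The abstract names the CVR and Fleiss’ kappa but does not give their formulas. The sketch below assumes Lawshe’s usual definition of the CVR, (n_e − N/2) / (N/2), and the standard Fleiss’ kappa computed from a table of per-item category counts. The function names and the example counts are hypothetical; the abstract does not state how the 6-point Likert ratings were dichotomized, so nothing here reproduces the study’s data.

```python
def content_validity_ratio(n_agree: int, n_experts: int) -> float:
    """Lawshe's CVR: (n_e - N/2) / (N/2), ranging from -1 to 1.

    n_agree   -- experts who judged the item valid (hypothetical cut-off)
    n_experts -- total experts who rated the item
    """
    if n_experts <= 0 or not 0 <= n_agree <= n_experts:
        raise ValueError("need 0 <= n_agree <= n_experts and n_experts > 0")
    half = n_experts / 2
    return (n_agree - half) / half


def fleiss_kappa(counts: list[list[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who put item i into category j. Every item must have the
    same number of raters (at least two)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of raters")

    # Observed agreement: mean proportion of agreeing rater pairs per item.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items

    # Chance agreement: sum of squared overall category proportions.
    n_categories = len(counts[0])
    total = n_items * n_raters
    p_categories = [
        sum(row[j] for row in counts) / total for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in p_categories)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Hypothetical item: 30 of 40 experts judged the response valid.
    print(content_validity_ratio(30, 40))
    # Hypothetical table: 3 items, 4 raters, 2 categories.
    print(fleiss_kappa([[4, 0], [2, 2], [1, 3]]))
```

Under these definitions, a CVR of 0 means exactly half of the panel judged an item valid, and a negative CVR (such as the −0.58 reported above) means that fewer than half did.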

CRIS type:

Journal article

Keywords:

Artificial intelligence; Natural Language processing; Patient education as topic; Rehabilitation; Scoliosis

List of authors:

Negrini, F.; Malfitano, C.; Ferriero, G.; Morone, G.; Negrini, A.; Zaina, F.; Ferrario, I.; Kiekens, C.; Negrini, S.; Vitale, J.

University authors:

FERRIERO GIORGIO

NEGRINI FRANCESCO

Link to the full record:

https://irinsubria.uninsubria.it/handle/11383/2196293

Published in:

EUROPEAN SPINE JOURNAL

Journal