Development and Evaluation of a Mental Health Chatbot Using ChatGPT 4.0: Mixed Methods User Experience Study With Korean Users

JMIR Med Inform. 2025 Jan 3;13:e63538. doi: 10.2196/63538.

ABSTRACT

BACKGROUND: Mental health chatbots have emerged as a promising tool for providing accessible and convenient support to individuals in need. Building on our previous research on digital interventions for loneliness and depression among Korean college students, this study addresses the limitations identified and explores more advanced artificial intelligence-driven solutions.

OBJECTIVE: This study aimed to develop and evaluate the performance of HoMemeTown Dr CareSam, an advanced cross-lingual chatbot using ChatGPT 4.0 (OpenAI) to provide seamless support in both English and Korean contexts. The chatbot was designed to address the need for more personalized and culturally sensitive mental health support identified in our previous work while providing an accessible and user-friendly interface for Korean young adults.

METHODS: We conducted a mixed methods pilot study with 20 Korean young adults aged 18 to 27 (mean 23.3, SD 1.96) years. The HoMemeTown Dr CareSam chatbot was developed using the GPT application programming interface, incorporating features such as a gratitude journal and risk detection. User satisfaction and chatbot performance were evaluated using quantitative surveys and qualitative feedback, with triangulation used to ensure the validity and robustness of findings through cross-verification of data sources. Comparative analyses were conducted with other large language model chatbots and existing digital therapy tools (Woebot [Woebot Health Inc] and Happify [Twill Inc]).

RESULTS: Users generally expressed positive views toward the chatbot, with positivity and support receiving the highest score on a 10-point scale (mean 9.0, SD 1.2), followed by empathy (mean 8.7, SD 1.6) and active listening (mean 8.0, SD 1.8). However, areas for improvement were noted in professionalism (mean 7.0, SD 2.0), complexity of content (mean 7.4, SD 2.0), and personalization (mean 7.4, SD 2.4). The chatbot demonstrated statistically significant performance differences compared with other large language model chatbots (F=3.27; P=.047), with more pronounced differences compared with Woebot and Happify (F=12.94; P<.001). Qualitative feedback highlighted the chatbot’s strengths in providing empathetic responses and a user-friendly interface, while areas for improvement included response speed and the naturalness of Korean language responses.
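
The F statistics reported above are consistent with an analysis of variance, although the abstract does not name the exact test or its degrees of freedom. As a minimal sketch, assuming a one-way ANOVA across independent groups, the following shows how such an F statistic is computed. The function name, group labels, and ratings are hypothetical and purely illustrative; they are not the study's data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists.

    Assumes independent groups, each with at least one score,
    and at least two groups overall.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical 10-point ratings for three tools (NOT the study's data).
ratings = {
    "chatbot_a": [9, 8, 9, 10, 8],
    "chatbot_b": [7, 8, 6, 7, 7],
    "chatbot_c": [6, 5, 7, 6, 6],
}

f_stat, df_b, df_w = one_way_anova_f(list(ratings.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A larger F indicates that the differences between group means are large relative to the spread of scores within each group, which is the sense in which the comparison with Woebot and Happify is described as more pronounced.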

CONCLUSIONS: The HoMemeTown Dr CareSam chatbot shows potential as a cross-lingual mental health support tool, achieving high user satisfaction and demonstrating comparative advantages over existing digital interventions. However, the study’s limited sample size and short-term nature necessitate further research. Future studies should include larger-scale clinical trials, enhanced risk detection features, and integration with existing health care systems to fully realize its potential in supporting mental well-being across different linguistic and cultural contexts.

PMID:39752663 | DOI:10.2196/63538