Predicting Somatic Symptom Disorder Using Ensemble Learning on Personality, Stress, and Emotion Regulation Data
Keywords:
Somatic symptom disorder, ensemble learning, personality traits, perceived stress, emotion regulation, machine learning
Abstract
The objective of this study was to develop and evaluate ensemble learning models to predict somatic symptom disorder based on personality traits, perceived stress, and emotion regulation variables. This cross-sectional study was conducted with an adult sample from South Africa using a predictive modeling design. Participants completed standardized self-report measures assessing somatic symptom severity, personality traits, perceived stress, and multiple dimensions of emotion regulation. Data preprocessing included handling missing values, standardizing continuous variables, and preparing features for machine learning analysis. Several base classifiers were trained and integrated using ensemble learning techniques, including random forest, gradient boosting, and stacking. Model training and evaluation were performed using stratified k-fold cross-validation to ensure robustness and reduce overfitting. Predictive performance was assessed using multiple classification metrics, including accuracy, sensitivity, specificity, precision, F1-score, and area under the receiver operating characteristic curve (AUC). Feature importance and incremental modeling analyses were conducted to examine the relative and combined contributions of predictor domains. The ensemble learning models demonstrated strong predictive performance, with the stacking ensemble yielding the highest accuracy and AUC, indicating excellent discrimination between higher and lower somatic symptom severity. Neuroticism and perceived stress emerged as the most influential predictors, followed by key emotion regulation difficulties, particularly impulse control problems and limited access to adaptive regulation strategies. Incremental analyses showed that models incorporating personality traits alone achieved moderate prediction, which improved significantly with the addition of perceived stress and improved further when emotion regulation variables were included.
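The evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names `classification_metrics` and `roc_auc` are hypothetical, the labels and scores are toy values rather than study data, and coding the higher-severity group as 1 is an assumption made only for illustration.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute threshold-based metrics from binary labels (1 = higher severity, assumed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall for the positive class
    specificity = ratio(tn, tn + fp)  # recall for the negative class
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Each positive/negative pair counts 1 for a win and 0.5 for a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example only; these values are not taken from the study.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_metrics = classification_metrics(toy_true, toy_pred)
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Accuracy, sensitivity, specificity, precision, and F1 all depend on the decision threshold applied to the model's output, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric are typically reported together.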