Predicting Somatic Complaints from Personality Traits and Smartphone-Derived Behavioral Data Using Machine Learning
Keywords:
somatic complaints, personality traits, smartphone sensing, digital phenotyping, machine learning
Abstract
This study aimed to predict somatic complaints by integrating Big Five personality traits and passive smartphone-derived behavioral indicators within a machine learning framework. A cross-sectional observational design was employed with an adult community sample recruited in Armenia. Participants completed standardized self-report measures assessing somatic complaints and Big Five personality traits. In parallel, passive smartphone sensing data were collected continuously over a four-week period using a custom Android application that captured behavioral indicators such as screen time, sleep regularity proxies, physical inactivity duration, nighttime phone activity, and app usage patterns, without recording personal content. Data preprocessing included cleaning, aggregation of behavioral features at the participant level, and standardization of predictors. Supervised machine learning models were trained to predict somatic complaint scores, including regularized linear models, support vector regression, random forest, and gradient boosting. Model evaluation used train–test splits with cross-validation, and performance was assessed using error-based indices and explained variance metrics. Nonlinear ensemble models significantly outperformed linear approaches in predicting somatic complaints, with gradient boosting explaining the largest proportion of variance. Models combining personality traits and smartphone-derived behavioral data demonstrated significantly higher predictive accuracy than models using either data source alone. Neuroticism showed the strongest positive contribution to somatic complaints, while conscientiousness and extraversion contributed negatively. Among behavioral indicators, sleep regularity, daily screen time, nighttime phone activity, and physical inactivity emerged as significant predictors. Model performance remained stable across gender and age subgroups, indicating robustness of the predictive relationships. 
Integrating personality traits with passive smartphone-derived behavioral data using machine learning provides a powerful and ecologically valid approach for predicting somatic complaints, supporting biopsychosocial models of psychosomatic health and highlighting the potential of digital phenotyping for personalized health assessment.
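The preprocessing summarized in the abstract (aggregating behavioral features at the participant level, then standardizing predictors) can be illustrated with a minimal sketch. The abstract does not specify the aggregation rule or the data layout, so everything here is an assumption made for illustration: the field names `participant_id`, `screen_time_h`, and `night_unlocks` are hypothetical, daily values are averaged per participant, and z-scores use the population standard deviation.

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate_by_participant(daily_records):
    """Average each daily feature over all days recorded for a participant.

    Missing values (None) are skipped; this is an assumed cleaning rule.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for rec in daily_records:
        pid = rec["participant_id"]
        for name, value in rec.items():
            if name != "participant_id" and value is not None:
                grouped[pid][name].append(value)
    return {
        pid: {name: mean(vals) for name, vals in feats.items()}
        for pid, feats in grouped.items()
    }


def standardize(features):
    """Z-score each feature across participants (population SD)."""
    names = sorted({n for feats in features.values() for n in feats})
    out = {pid: {} for pid in features}
    for name in names:
        values = [f[name] for f in features.values() if name in f]
        mu, sd = mean(values), pstdev(values)
        for pid, f in features.items():
            if name in f:
                # A constant feature carries no information; map it to 0.
                out[pid][name] = 0.0 if sd == 0 else (f[name] - mu) / sd
    return out


# Hypothetical daily sensing records for two participants.
daily = [
    {"participant_id": "p1", "screen_time_h": 4.0, "night_unlocks": 2},
    {"participant_id": "p1", "screen_time_h": 6.0, "night_unlocks": 4},
    {"participant_id": "p2", "screen_time_h": 2.0, "night_unlocks": None},
    {"participant_id": "p2", "screen_time_h": 4.0, "night_unlocks": 1},
]

participant_features = aggregate_by_participant(daily)
z_features = standardize(participant_features)
```

In a predictive setting, the standardization parameters would normally be estimated on the training folds only and then applied to held-out data, so that test-set information does not leak into the predictors. The sketch standardizes over all participants purely for brevity.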