Discovery of Gene Expression Signatures Using Machine Learning: Social Isolation and Genetic Expression in Adolescence and Young Adulthood

Brandt Levitt, University of North Carolina at Chapel Hill
Lauren Gaydosh, Vanderbilt University
Mike Shanahan, University of Zurich
Steve Cole, University of California, Los Angeles
Kathleen Mullan Harris, University of North Carolina at Chapel Hill

A large literature identifies social isolation as an important psychosocial determinant of health, but the biological mechanisms underlying this connection remain poorly understood. We address this gap using genome-wide transcriptome data from the National Longitudinal Study of Adolescent to Adult Health (Add Health) to examine whether social isolation operates through the expression of genes involved in innate and adaptive immune responses within the stress process system. We build on previous work that identified conserved immunological genes as important mediators of poor health outcomes in socially isolated individuals. We construct a quantitative measure of social isolation across multiple contexts. We then use regression models and machine learning algorithms to develop a gene expression signature correlated with social isolation. To better understand this connection, we analyze genetic regulatory features and immunological cell subsets, seeking to identify causal patterns that explain the biological processes that make social isolation a risk factor for poor health.
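As a rough illustration of the workflow described above, the sketch below builds a composite isolation score from several contexts and ranks genes by how strongly their expression correlates with it. It is not the authors' method: simple correlation screening stands in for the regression and machine learning models, the data are synthetic, and every name (`isolation_score`, `rank_genes`, `gene_planted`) is hypothetical.

```python
import math
import random


def zscore(values):
    """Standardize a list of numbers to mean 0 and sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    zx, zy = zscore(x), zscore(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


def isolation_score(context_items):
    """Composite score: the mean of z-scored indicators, one list per context."""
    standardized = [zscore(item) for item in context_items]
    n_people = len(context_items[0])
    return [
        sum(col[i] for col in standardized) / len(standardized)
        for i in range(n_people)
    ]


def rank_genes(expression, score, top_k):
    """Return the top_k (gene, r) pairs ordered by absolute correlation with score."""
    correlations = {gene: pearson(vals, score) for gene, vals in expression.items()}
    ranked = sorted(correlations.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top_k]


# Synthetic data only (an assumption for illustration, not Add Health data).
rng = random.Random(0)
n_people = 200

# Three hypothetical contexts in which isolation is measured.
contexts = [[rng.gauss(0, 1) for _ in range(n_people)] for _ in range(3)]
score = isolation_score(contexts)

# Twenty noise genes plus one gene deliberately constructed to track the score.
expression = {
    f"gene_{j}": [rng.gauss(0, 1) for _ in range(n_people)] for j in range(20)
}
expression["gene_planted"] = [s + rng.gauss(0, 0.5) for s in score]

signature = rank_genes(expression, score, top_k=5)
for gene, r in signature:
    print(f"{gene}: r = {r:.2f}")
```

A real analysis would likely adjust for covariates, correct for multiple testing across the transcriptome, and validate any signature on held-out data; none of that is attempted here.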

See extended abstract

Presented in Session 100: Using Big Data in Population Research