Learning About Internal Migration From Half a Billion Individual Records: Applying Localized Classification Trees to Large-Scale Census Data

Guy J. Abel, Asian Demographic Research Institute
Raya Muttarak, Vienna Institute of Demography
Fabian Stephany, University of Cambridge and Wittgenstein Centre (IIASA, VID/ÖAW, WU)

Understanding who migrates is crucial for explaining societal change and forecasting future populations. However, there is no empirical consensus on the demographic and socioeconomic factors driving migration decisions. Exploiting census microdata from the Integrated Public Use Microdata Series International (IPUMSI) database, covering 477,296,432 individual records from 65 countries over the period 1960 to 2012, this study aims to establish common demographic drivers of migration. With such an exceptionally large number of observations, a parametric approach would simply yield biased estimates of the standard errors of the variables of interest. We therefore apply a machine learning technique based on decision tree models to identify common demographic patterns driving migration in our data. We find that, globally, age, education, household size, and urbanisation are important drivers of internal migration. Age and education are particularly important predictors in Europe and Northern America, whilst in South and Central America and Africa, urbanisation and household size are more relevant.
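
As a rough illustration of how a classification tree ranks candidate drivers, the sketch below scores a single binary split per variable by its reduction in Gini impurity. It is not the authors' implementation and does not reproduce their localized trees. The records, the variable names (`age`, `household_size`), and the helper functions are hypothetical, invented only to show the mechanics.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 = pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels, feature):
    """Return (threshold, impurity_reduction) for the best binary split on one feature."""
    parent = gini(labels)
    n = len(rows)
    best = (None, 0.0)
    values = sorted({row[feature] for row in rows})
    # Candidate thresholds are midpoints between consecutive observed values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - child
        if gain > best[1]:
            best = (threshold, gain)
    return best

# Hypothetical toy records (NOT IPUMSI data): 1 = migrated, 0 = did not migrate.
rows = [
    {"age": 22, "household_size": 2},
    {"age": 25, "household_size": 5},
    {"age": 28, "household_size": 3},
    {"age": 45, "household_size": 2},
    {"age": 52, "household_size": 4},
    {"age": 60, "household_size": 3},
]
migrated = [1, 1, 1, 0, 0, 0]

# Rank variables by the impurity reduction of their best single split.
ranking = sorted(
    ((best_split(rows, migrated, feature)[1], feature) for feature in rows[0]),
    reverse=True,
)
for gain, feature in ranking:
    print(f"{feature}: impurity reduction {gain:.3f}")
```

In this toy sample the age split separates migrants from non-migrants perfectly, so age ranks first. A full tree would apply the same search recursively within each resulting subgroup.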


 Presented in Session 3. Population, Development, & the Environment; Data & Methods; Applied Demography