Statistical Models to Assess Data Quality in DHS Surveys

Thomas W. Pullum, ICF International
Christina Juan

Indicators of the quality of demographic data in a survey are typically calculated from summary distributions or tables, using methods originally developed for census data. Such indicators are limited in number and scope, and they fall outside any framework for statistical modeling. With survey data, such as those collected by The Demographic and Health Surveys Program (DHS), a wide range of individual-level indicators of data quality can be constructed with methods that allow for multivariate analysis. Our models use binary indicators for data quality domains such as nonresponse, potential omission of births, problematic reports of age, and problems during data collection such as suspiciously short interviews. The indicators are analyzed according to characteristics of the respondent, interviewer ID, and, when possible, characteristics of the interviewer. Associations among indicators can be described, and differences between surveys can be tested. The focus is on the statistical models, but illustrative applications are included.
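As a rough illustration of the individual-level approach, the sketch below builds two binary data-quality indicators per interview and relates them to interviewer ID. Everything in it is an assumption made for the example rather than the authors' specification: the records are invented, the 15-minute cutoff for a "short" interview and the use of ages ending in 0 or 5 as "problematic" are hypothetical choices, and the log odds ratio (with 0.5 added to each cell) stands in for the slope of a logistic model with a single binary covariate.

```python
import math
from collections import defaultdict

# Invented records: (interviewer_id, reported_age, interview_minutes).
# These are illustrative values, not DHS data.
RECORDS = [
    ("A", 25, 12),
    ("A", 30, 41),
    ("A", 37, 38),
    ("A", 40, 9),
    ("B", 22, 35),
    ("B", 33, 44),
    ("B", 28, 52),
    ("B", 45, 40),
]

SHORT_MINUTES = 15  # assumed cutoff for a suspiciously short interview


def heaped_age(rec):
    """1 if the reported age ends in 0 or 5 (assumed definition), else 0."""
    return int(rec[1] % 5 == 0)


def short_interview(rec):
    """1 if the interview lasted less than SHORT_MINUTES, else 0."""
    return int(rec[2] < SHORT_MINUTES)


def flag_rate_by_interviewer(records, flag):
    """Share of flagged interviews for each interviewer ID."""
    counts = defaultdict(lambda: [0, 0])  # interviewer -> [flagged, total]
    for rec in records:
        counts[rec[0]][0] += flag(rec)
        counts[rec[0]][1] += 1
    return {iid: flagged / total for iid, (flagged, total) in counts.items()}


def log_odds_ratio(records, exposure, flag):
    """Log odds ratio of flag by a binary exposure.

    0.5 is added to every cell of the 2x2 table so that empty cells
    do not produce a division by zero or log(0).
    """
    table = [[0.5, 0.5], [0.5, 0.5]]  # table[exposure][flag]
    for rec in records:
        table[exposure(rec)][flag(rec)] += 1
    return math.log(
        (table[1][1] * table[0][0]) / (table[1][0] * table[0][1])
    )


short_rates = flag_rate_by_interviewer(RECORDS, short_interview)
heaped_rates = flag_rate_by_interviewer(RECORDS, heaped_age)
lor_short_for_a = log_odds_ratio(
    RECORDS, lambda rec: int(rec[0] == "A"), short_interview
)
```

With real data, the same 0/1 indicators could serve as outcomes in a multivariate model with respondent and interviewer characteristics as covariates; the two-way comparison here is only the simplest case.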


Presented in Session 3. Population, Development, & the Environment; Data & Methods; Applied Demography