Network Model-Assisted Inference from Respondent-Driven Sampling Data

Abstract

Respondent-driven sampling is a widely used method for sampling hard-to-reach human populations by link tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is partly beyond the control of the researcher and partly implicitly defined. Therefore, it is not generally possible to compute the sampling weights for traditional design-based inference directly, and likelihood inference requires modelling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared with existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of the prevalence of human immunodeficiency virus in a high-risk population.

Publication
Journal of the Royal Statistical Society, Series A