On Imputing UNHCR Data


Dyadic data from UNHCR on the size of the global refugee population are widely used. However, for a large fraction of the refugee population, these data provide no information about refugees’ country of origin, which contributes to a high nominal rate of unreported values in the data. In this article, I demonstrate that two imputation approaches outperform the current standard approach, which assumes that all unreported values are zero. The first approach interpolates the unreported values, while the second predicts them based on trends observed in other dyads. Although the two approaches draw on different types of information, they perform similarly. Replicating a published study on the effect of refugee crises on international war and peace, I demonstrate how both approaches strengthen the author’s findings and help to minimize the risk of a null finding.
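
To make the contrast between the standard zero-fill approach and the first imputation approach concrete, the following is a minimal sketch, not the article's implementation. It assumes linear interpolation between the nearest reported years (the abstract does not specify the interpolation method), and the function names and the dyad-year counts are hypothetical illustrations rather than UNHCR figures.

```python
from typing import List, Optional, Sequence


def zero_fill(series: Sequence[Optional[float]]) -> List[float]:
    """Standard approach: treat every unreported value as zero."""
    return [0.0 if v is None else float(v) for v in series]


def interpolate(series: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Assumed variant of the first approach: linear interpolation.

    Only gaps that lie between two reported values are filled.
    Leading and trailing gaps stay None, because there is no
    second anchor point to interpolate towards.
    """
    out: List[Optional[float]] = [None if v is None else float(v) for v in series]
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out


# Hypothetical dyad: yearly refugee counts from one origin in one host
# country, with None marking years in which no value was reported.
counts = [1000, None, None, 4000, None]

print(zero_fill(counts))    # [1000.0, 0.0, 0.0, 4000.0, 0.0]
print(interpolate(counts))  # [1000.0, 2000.0, 3000.0, 4000.0, None]
```

In this toy series, zero-filling implies that the refugee population vanished for two years and then reappeared, whereas interpolation yields a smooth trajectory between the reported years. The second approach, which predicts unreported values from trends in other dyads, would need data on those dyads and is not sketched here.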

Journal article
Research and Politics