Constructing synthetic samples

Hua Dong, Glen Meeden

Research output: Contribution to journal › Article › peer-review


We consider the problem of constructing a synthetic sample from a population of interest which cannot be sampled from but for which the population means of some of its variables are known. In addition, we assume that we have in hand samples from two similar populations. Using the known population means, we will select subsamples from the samples of the other two populations which we will then combine to construct the synthetic sample. The synthetic sample is obtained by solving an optimization problem, where the known population means are used as constraints. The optimization is achieved through an adaptive random search algorithm. Simulation studies are presented to demonstrate the effectiveness of our approach. We observe that, on average, such synthetic samples behave very much like actual samples from the population of interest. As an application, we consider constructing a one-percent synthetic sample for the missing 1890 decennial sample of the United States.
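The selection step described in the abstract can be illustrated with a small sketch. It is not the authors' method: the abstract does not state the paper's objective function or the details of its adaptive random search, so the code below swaps in a plain random swap search and treats the known means as a squared-error penalty rather than as formal constraints. The names `mean_gap` and `select_subsample`, the simulated donor samples, and the target means are all hypothetical.

```python
import random


def mean_gap(rows, target_means):
    """Sum of squared gaps between the column means of `rows` and the targets."""
    n = len(rows)
    return sum(
        (sum(row[j] for row in rows) / n - target) ** 2
        for j, target in enumerate(target_means)
    )


def select_subsample(pool, target_means, size, n_iter=2000, seed=0):
    """Pick `size` rows of `pool` whose column means approximate the targets.

    Assumption: a simple swap-based random search stands in for the
    paper's adaptive random search, which the abstract does not specify.
    Requires 0 < size < len(pool).
    """
    rng = random.Random(seed)
    chosen = rng.sample(range(len(pool)), size)
    chosen_set = set(chosen)
    unchosen = [i for i in range(len(pool)) if i not in chosen_set]

    initial = mean_gap([pool[i] for i in chosen], target_means)
    best = initial
    for _ in range(n_iter):
        # Propose swapping one selected unit with one unselected unit.
        a = rng.randrange(len(chosen))
        b = rng.randrange(len(unchosen))
        chosen[a], unchosen[b] = unchosen[b], chosen[a]
        gap = mean_gap([pool[i] for i in chosen], target_means)
        if gap < best:
            best = gap  # keep the improving swap
        else:
            chosen[a], unchosen[b] = unchosen[b], chosen[a]  # undo the swap
    return sorted(chosen), initial, best


# Hypothetical data: two donor samples with different means, pooled together.
data_rng = random.Random(1)
sample_a = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)) for _ in range(200)]
sample_b = [(data_rng.gauss(2.0, 1.0), data_rng.gauss(1.0, 1.0)) for _ in range(200)]
pool = sample_a + sample_b

known_means = (1.2, 0.4)  # assumed known means of the unsampled population
indices, initial_gap, final_gap = select_subsample(pool, known_means, size=100)
print(f"gap before search: {initial_gap:.6f}, after: {final_gap:.6f}")
```

Because a swap is kept only when it reduces the gap, the final gap can never exceed the initial one. A fuller treatment would presumably also need a criterion for choosing among the many subsamples that match the known means, which this sketch does not attempt.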

Original language: English (US)
Pages (from-to): 113-127
Number of pages: 15
Journal: Journal of Official Statistics
Issue number: 1
State: Published - Mar 2016

Bibliographical note

Publisher Copyright:
© Statistics Sweden.


  • Missing data
  • Sample survey
  • Synthetic samples

