Imputation of missing values in DNA microarray gene expression data

Hyunsoo Kim, Gene H. Golub, Haesun Park

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

19 Scopus citations

Abstract

Most multivariate statistical methods for gene expression data require a complete matrix of gene array values. In this paper, an imputation method based on a least squares formulation is proposed to estimate missing values. It exploits local similarity structures in the data together with least squares optimization. The proposed local least squares imputation method (LLSimpute) represents a target gene that has missing values as a linear combination of similar genes. This algorithm showed better performance than other imputation methods, such as k-nearest neighbor imputation and an imputation method based on Bayesian principal component analysis.
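The core idea in the abstract, writing a target gene as a linear combination of similar genes and using the fitted coefficients to fill its missing entries, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `lls_impute_row`, the use of Euclidean distance to pick similar genes, the restriction to complete rows as candidates, and the neighbor count `k` are all assumptions made for the example.

```python
import numpy as np

def lls_impute_row(data, target, k=2):
    """Fill NaNs in row `target` from a least squares fit on k similar rows.

    Hypothetical helper: rows are genes, columns are experiments.
    """
    row = data[target]
    missing = np.isnan(row)
    if not missing.any():
        return row.copy()
    # Assumption: only genes without any missing value serve as candidates.
    complete = [i for i in range(data.shape[0])
                if i != target and not np.isnan(data[i]).any()]
    candidates = data[complete]
    # Assumption: similarity is Euclidean distance over the columns
    # that are observed in the target gene.
    dists = np.linalg.norm(candidates[:, ~missing] - row[~missing], axis=1)
    nearest = candidates[np.argsort(dists)[:k]]
    # Least squares: coefficients x such that the observed part of the
    # target is approximated by a linear combination of the neighbors.
    A = nearest[:, ~missing].T          # shape (n_observed, k)
    b = row[~missing]
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Apply the same coefficients to the neighbors' values in the
    # columns where the target is missing.
    filled = row.copy()
    filled[missing] = nearest[:, missing].T @ x
    return filled

# Toy data: the last gene was built as 2*g0 + 1*g1, with its final value hidden.
data = np.array([
    [1.0, 0.0, 2.0, 1.0, 3.0],     # g0
    [0.0, 1.0, 1.0, 2.0, 1.0],     # g1
    [5.0, 5.0, 5.0, -5.0, 0.0],    # g2, dissimilar to the target
    [2.0, 1.0, 5.0, 4.0, np.nan],  # target gene with one missing value
])
filled = lls_impute_row(data, target=3, k=2)
print(filled)  # the hidden entry is recovered as approximately 7
```

In this toy case the fit is exact because the target lies in the span of its two neighbors. On real expression data the fit would only be approximate, and the choice of `k` and of the similarity measure would need to be tuned.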

Original language: English (US)
Title of host publication: Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
Pages: 572-573
Number of pages: 2
State: Published - 2004
Externally published: Yes
Event: Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004 - Stanford, CA, United States
Duration: Aug 16 2004 to Aug 19 2004
