This paper describes a new approach to normalizing microarray expression data. The novel feature is that it unifies two tasks: estimating the normalization coefficients and identifying the control gene set. Unification is realized by constructing a window function over the scatter plot, which defines the subset of constantly expressed genes, and by carrying out the optimization with an iterative procedure. The structure of the window function gates contributions to the control gene set used to estimate the normalization coefficients. This window measures the consistency of the matched neighborhoods in the scatter plot and provides a means of rejecting control gene outliers. The recovery of the normalization regression and the selection of control genes are interleaved and are realized by applying coupled operations to the mean square error function. In this way, the two processes bootstrap one another. We evaluate the technique on real microarray data from breast cancer cell lines and complement the experiment with a data cluster visualization study.
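The interleaving of regression and control gene selection can be illustrated with a minimal sketch. The abstract does not give the form of the window function, so everything specific here is an assumption made for illustration: a hard 0/1 window on the regression residuals, a median absolute deviation (MAD) scale estimate, the `width` parameter, and the function names `fit_weighted_line`, `update_window`, and `normalize_with_controls`. It should not be read as the authors' algorithm.

```python
from statistics import median


def fit_weighted_line(x, y, w):
    """Weighted least-squares fit of y ~ a*x + b (minimizes the weighted MSE)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    a = sxy / sxx
    return a, my - a * mx


def update_window(x, y, a, b, width):
    """Assumed window: keep genes whose residual lies near the median residual."""
    r = [yi - (a * xi + b) for xi, yi in zip(x, y)]
    center = median(r)
    # Robust scale from the MAD; 1.4826 makes it comparable to a Gaussian sigma.
    scale = max(1.4826 * median([abs(ri - center) for ri in r]), 1e-12)
    return [1.0 if abs(ri - center) <= width * scale else 0.0 for ri in r]


def normalize_with_controls(x, y, width=2.0, max_iter=50):
    """Alternate between fitting the line and re-selecting the control set."""
    w = [1.0] * len(x)            # start with every gene as a control
    a, b = fit_weighted_line(x, y, w)
    for _ in range(max_iter):
        w_new = update_window(x, y, a, b, width)
        if sum(w_new) < 2 or w_new == w:
            break                 # too few controls, or the control set is stable
        w = w_new
        a, b = fit_weighted_line(x, y, w)
    return a, b, [wi > 0 for wi in w]
```

Here `x` and `y` would be the (log) expression values of the same genes on two arrays. The function returns the slope, the intercept, and a boolean mask marking the genes retained as controls; genes that fall outside the window in the final iteration are the ones treated as outliers.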
|Original language|English (US)|
|Number of pages|9|
|Journal|IEEE Transactions on Information Technology in Biomedicine|
|State|Published - Mar 2002|
Bibliographical note
Funding Information:
Manuscript received March 28, 2001; revised September 14, 2001. This work was supported in part by the National Institutes of Health under Grants 5R21CA83231 and R01CA/AG58022. Y. Wang and J. Lu are with the Department of Electrical Engineering and Computer Science, The Catholic University of America, Washington, DC 20064 USA. R. Lee and R. Clarke are with the Lombardi Cancer Center, Georgetown University Medical Center, Washington, DC 20007 USA. Z. Gu is with Celera Genomics, Inc., Rockville, MD 20850 USA. Publisher Item Identifier S 1089-7771(02)02009-5.
Keywords
- Data normalization
- Dynamic programming
- Gene expression
- Gene microarray
- Linear regression