A high productivity/low maintenance approach to high-performance computation for biomedicine: Four case studies

Nicholas Carriero, Michael V. Osier, Kei Hoi Cheung, Perry L. Miller, Mark Gerstein, Hongyu Zhao, Baolin Wu, Scott Rifkin, Joseph Chang, Heping Zhang, Kevin White, Kenneth Williams, Martin Schultz

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

The rapid advances in high-throughput biotechnologies such as DNA microarrays and mass spectrometry have generated vast amounts of data ranging from gene expression to proteomics data. The large size and complexity involved in analyzing such data demand a significant amount of computing power. High-performance computation (HPC) is an attractive and increasingly affordable approach to help meet this challenge. There is a spectrum of techniques that can be used to achieve computational speedup with varying degrees of impact in terms of how drastic a change is required to allow the software to run on an HPC platform. This paper describes a high-productivity/low-maintenance (HP/LM) approach to HPC that is based on establishing a collaborative relationship between the bioinformaticist and HPC expert that respects the former's codes and minimizes the latter's efforts. The goal of this approach is to make it easy for bioinformatics researchers to continue to make iterative refinements to their programs, while still being able to take advantage of HPC. The paper describes our experience applying these HP/LM techniques in four bioinformatics case studies: (1) genome-wide sequence comparison using Blast, (2) identification of biomarkers based on statistical analysis of large mass spectrometry data sets, (3) complex genetic analysis involving ordinal phenotypes, (4) large-scale assessment of the effect of possible errors in analyzing microarray data. The case studies illustrate how the HP/LM approach can be applied to a range of representative bioinformatics applications and how the approach can lead to significant speedup of computationally intensive bioinformatics applications, while making only modest modifications to the programs themselves.

Original language: English (US)
Pages (from-to): 90-98
Number of pages: 9
Journal: Journal of the American Medical Informatics Association
Volume: 12
Issue number: 1
DOI: https://doi.org/10.1197/jamia.M1571
State: Published - Jan 1 2005

Fingerprint

Computational Biology
Maintenance
Mass Spectrometry
Biotechnology
Oligonucleotide Array Sequence Analysis
Proteomics
Software
Biomarkers
Research Personnel
Genome
Phenotype
Gene Expression

Cite this

A high productivity/low maintenance approach to high-performance computation for biomedicine: Four case studies. / Carriero, Nicholas; Osier, Michael V.; Cheung, Kei Hoi; Miller, Perry L.; Gerstein, Mark; Zhao, Hongyu; Wu, Baolin; Rifkin, Scott; Chang, Joseph; Zhang, Heping; White, Kevin; Williams, Kenneth; Schultz, Martin.

In: Journal of the American Medical Informatics Association, Vol. 12, No. 1, 01.01.2005, p. 90-98.

Carriero, N, Osier, MV, Cheung, KH, Miller, PL, Gerstein, M, Zhao, H, Wu, B, Rifkin, S, Chang, J, Zhang, H, White, K, Williams, K & Schultz, M 2005, 'A high productivity/low maintenance approach to high-performance computation for biomedicine: Four case studies', Journal of the American Medical Informatics Association, vol. 12, no. 1, pp. 90-98. https://doi.org/10.1197/jamia.M1571
Carriero, Nicholas ; Osier, Michael V. ; Cheung, Kei Hoi ; Miller, Perry L. ; Gerstein, Mark ; Zhao, Hongyu ; Wu, Baolin ; Rifkin, Scott ; Chang, Joseph ; Zhang, Heping ; White, Kevin ; Williams, Kenneth ; Schultz, Martin. / A high productivity/low maintenance approach to high-performance computation for biomedicine: Four case studies. In: Journal of the American Medical Informatics Association. 2005 ; Vol. 12, No. 1. pp. 90-98.
@article{5839e688197041fb8c94dacac8924735,
title = "A high productivity/low maintenance approach to high-performance computation for biomedicine: Four case studies",
abstract = "The rapid advances in high-throughput biotechnologies such as DNA microarrays and mass spectrometry have generated vast amounts of data ranging from gene expression to proteomics data. The large size and complexity involved in analyzing such data demand a significant amount of computing power. High-performance computation (HPC) is an attractive and increasingly affordable approach to help meet this challenge. There is a spectrum of techniques that can be used to achieve computational speedup with varying degrees of impact in terms of how drastic a change is required to allow the software to run on an HPC platform. This paper describes a high-productivity/low-maintenance (HP/LM) approach to HPC that is based on establishing a collaborative relationship between the bioinformaticist and HPC expert that respects the former's codes and minimizes the latter's efforts. The goal of this approach is to make it easy for bioinformatics researchers to continue to make iterative refinements to their programs, while still being able to take advantage of HPC. The paper describes our experience applying these HP/LM techniques in four bioinformatics case studies: (1) genome-wide sequence comparison using Blast, (2) identification of biomarkers based on statistical analysis of large mass spectrometry data sets, (3) complex genetic analysis involving ordinal phenotypes, (4) large-scale assessment of the effect of possible errors in analyzing microarray data. The case studies illustrate how the HP/LM approach can be applied to a range of representative bioinformatics applications and how the approach can lead to significant speedup of computationally intensive bioinformatics applications, while making only modest modifications to the programs themselves.",
author = "Nicholas Carriero and Osier, {Michael V.} and Cheung, {Kei Hoi} and Miller, {Perry L.} and Mark Gerstein and Hongyu Zhao and Baolin Wu and Scott Rifkin and Joseph Chang and Heping Zhang and Kevin White and Kenneth Williams and Martin Schultz",
year = "2005",
month = "1",
day = "1",
doi = "10.1197/jamia.M1571",
language = "English (US)",
volume = "12",
pages = "90--98",
journal = "Journal of the American Medical Informatics Association : JAMIA",
issn = "1067-5027",
publisher = "Oxford University Press",
number = "1",

}

TY - JOUR
T1 - A high productivity/low maintenance approach to high-performance computation for biomedicine
T2 - Four case studies
AU - Carriero, Nicholas
AU - Osier, Michael V.
AU - Cheung, Kei Hoi
AU - Miller, Perry L.
AU - Gerstein, Mark
AU - Zhao, Hongyu
AU - Wu, Baolin
AU - Rifkin, Scott
AU - Chang, Joseph
AU - Zhang, Heping
AU - White, Kevin
AU - Williams, Kenneth
AU - Schultz, Martin
PY - 2005/1/1
Y1 - 2005/1/1
N2 - The rapid advances in high-throughput biotechnologies such as DNA microarrays and mass spectrometry have generated vast amounts of data ranging from gene expression to proteomics data. The large size and complexity involved in analyzing such data demand a significant amount of computing power. High-performance computation (HPC) is an attractive and increasingly affordable approach to help meet this challenge. There is a spectrum of techniques that can be used to achieve computational speedup with varying degrees of impact in terms of how drastic a change is required to allow the software to run on an HPC platform. This paper describes a high-productivity/low-maintenance (HP/LM) approach to HPC that is based on establishing a collaborative relationship between the bioinformaticist and HPC expert that respects the former's codes and minimizes the latter's efforts. The goal of this approach is to make it easy for bioinformatics researchers to continue to make iterative refinements to their programs, while still being able to take advantage of HPC. The paper describes our experience applying these HP/LM techniques in four bioinformatics case studies: (1) genome-wide sequence comparison using Blast, (2) identification of biomarkers based on statistical analysis of large mass spectrometry data sets, (3) complex genetic analysis involving ordinal phenotypes, (4) large-scale assessment of the effect of possible errors in analyzing microarray data. The case studies illustrate how the HP/LM approach can be applied to a range of representative bioinformatics applications and how the approach can lead to significant speedup of computationally intensive bioinformatics applications, while making only modest modifications to the programs themselves.
AB - The rapid advances in high-throughput biotechnologies such as DNA microarrays and mass spectrometry have generated vast amounts of data ranging from gene expression to proteomics data. The large size and complexity involved in analyzing such data demand a significant amount of computing power. High-performance computation (HPC) is an attractive and increasingly affordable approach to help meet this challenge. There is a spectrum of techniques that can be used to achieve computational speedup with varying degrees of impact in terms of how drastic a change is required to allow the software to run on an HPC platform. This paper describes a high-productivity/low-maintenance (HP/LM) approach to HPC that is based on establishing a collaborative relationship between the bioinformaticist and HPC expert that respects the former's codes and minimizes the latter's efforts. The goal of this approach is to make it easy for bioinformatics researchers to continue to make iterative refinements to their programs, while still being able to take advantage of HPC. The paper describes our experience applying these HP/LM techniques in four bioinformatics case studies: (1) genome-wide sequence comparison using Blast, (2) identification of biomarkers based on statistical analysis of large mass spectrometry data sets, (3) complex genetic analysis involving ordinal phenotypes, (4) large-scale assessment of the effect of possible errors in analyzing microarray data. The case studies illustrate how the HP/LM approach can be applied to a range of representative bioinformatics applications and how the approach can lead to significant speedup of computationally intensive bioinformatics applications, while making only modest modifications to the programs themselves.
UR - http://www.scopus.com/inward/record.url?scp=19944365173&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=19944365173&partnerID=8YFLogxK
U2 - 10.1197/jamia.M1571
DO - 10.1197/jamia.M1571
M3 - Article
C2 - 15492032
AN - SCOPUS:19944365173
VL - 12
SP - 90
EP - 98
JO - Journal of the American Medical Informatics Association : JAMIA
JF - Journal of the American Medical Informatics Association : JAMIA
SN - 1067-5027
IS - 1
ER -