TY - JOUR
T1 - Embracing the informative missingness and silent gene in analyzing biologically diverse samples
AU - Du, Dongping
AU - Bhardwaj, Saurabh
AU - Lu, Yingzhou
AU - Wang, Yizhi
AU - Parker, Sarah J.
AU - Zhang, Zhen
AU - Van Eyk, Jennifer E.
AU - Yu, Guoqiang
AU - Clarke, Robert
AU - Herrington, David M.
AU - Wang, Yue
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
N2 - Bioinformatics software tools are essential to identify informative molecular features that define different phenotypic sample groups. Among the most fundamental and interrelated tasks are missing value imputation, signature gene detection, and differential pattern visualization. However, many commonly used analytics tools can be problematic when handling biologically diverse samples if either informative missingness possesses high missing rates with mixed missing mechanisms, or multiple sample groups are compared and visualized in parallel. We developed the ABDS tool suite specifically for analyzing biologically diverse samples. Collectively, a mechanism-integrated group-wise pre-imputation scheme is proposed to retain informative missingness associated with signature genes, a cosine-based one-sample test is extended to detect group-silenced signature genes, and a unified heatmap is designed to display multiple sample groups. We describe the methodological principles and demonstrate the effectiveness of the three analytics tools under targeted scenarios, supported by comparative evaluations and biomedical showcases. As an open-source R package, the ABDS tool suite complements rather than replaces existing tools and will allow biologists to more accurately detect interpretable molecular signals among phenotypically diverse sample groups.
AB - Bioinformatics software tools are essential to identify informative molecular features that define different phenotypic sample groups. Among the most fundamental and interrelated tasks are missing value imputation, signature gene detection, and differential pattern visualization. However, many commonly used analytics tools can be problematic when handling biologically diverse samples if either informative missingness possesses high missing rates with mixed missing mechanisms, or multiple sample groups are compared and visualized in parallel. We developed the ABDS tool suite specifically for analyzing biologically diverse samples. Collectively, a mechanism-integrated group-wise pre-imputation scheme is proposed to retain informative missingness associated with signature genes, a cosine-based one-sample test is extended to detect group-silenced signature genes, and a unified heatmap is designed to display multiple sample groups. We describe the methodological principles and demonstrate the effectiveness of the three analytics tools under targeted scenarios, supported by comparative evaluations and biomedical showcases. As an open-source R package, the ABDS tool suite complements rather than replaces existing tools and will allow biologists to more accurately detect interpretable molecular signals among phenotypically diverse sample groups.
UR - http://www.scopus.com/inward/record.url?scp=85209357409&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85209357409&partnerID=8YFLogxK
U2 - 10.1038/s41598-024-78076-0
DO - 10.1038/s41598-024-78076-0
M3 - Article
C2 - 39550430
AN - SCOPUS:85209357409
SN - 2045-2322
VL - 14
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 28265
ER -