An integrative ENCODE resource for cancer genomics

ENCODE Project Consortium

Research output: Contribution to journal › Article › peer-review

69 Scopus citations


ENCODE comprises thousands of functional genomics datasets, and the encyclopedia covers hundreds of cell types, providing a universal annotation for genome interpretation. However, for particular applications, it may be advantageous to use a customized annotation. Here, we develop such a custom annotation by leveraging advanced assays, such as eCLIP, Hi-C, and whole-genome STARR-seq on a number of data-rich ENCODE cell types. A key aspect of this annotation is comprehensive and experimentally derived networks of both transcription factors and RNA-binding proteins (TFs and RBPs). Cancer, a disease of system-wide dysregulation, is an ideal application for such a network-based annotation. Specifically, for cancer-associated cell types, we put regulators into hierarchies and measure their network change (rewiring) during oncogenesis. We also extensively survey TF-RBP crosstalk, highlighting how SUB1, a previously uncharacterized RBP, drives aberrant tumor expression and amplifies the effect of MYC, a well-known oncogenic TF. Furthermore, we show how our annotation allows us to place oncogenic transformations in the context of a broad cell space; here, many normal-to-tumor transitions move towards a stem-like state, while oncogene knockdowns show an opposing trend. Finally, we organize the resource into a coherent workflow to prioritize key elements and variants, in addition to regulators. We showcase the application of this prioritization to somatic burdening, cancer differential expression and GWAS. Targeted validations of the prioritized regulators, elements and variants using siRNA knockdowns, CRISPR-based editing, and luciferase assays demonstrate the value of the ENCODE resource.

Original language: English (US)
Article number: 3696
Pages (from-to): 3696
Journal: Nature Communications
Issue number: 1
State: Published - Jul 29 2020

Bibliographical note

Funding Information:
We acknowledge support from the NIH (U24 HG 009446-02) and from the AL Williams Professorship funds. F.Y. is supported by NIH grants R35GM124820, R01HG009906, U01CA200060, and R24DK106766. R.J.K. was supported by NIH grant U01 HG007033. P.J. was supported by NIH grant 1K99CA218900-01A1.

Publisher Copyright:
© 2020, The Author(s).

PubMed: MeSH publication types

  • Research Support, Non-U.S. Gov't
  • Journal Article
  • Research Support, N.I.H., Extramural

