To develop a catalog of regulatory sites in two major model organisms, Drosophila melanogaster and Caenorhabditis elegans, the modERN (model organism Encyclopedia of Regulatory Networks) consortium has systematically assayed the binding sites of transcription factors (TFs). Combined with data produced by our predecessor, modENCODE (Model Organism ENCyclopedia Of DNA Elements), we now have data for 262 TFs identifying 1.23 M sites in the fly genome and 217 TFs identifying 0.67 M sites in the worm genome. Because sites from different TFs often overlap and are tightly clustered, they fall into 91,011 and 59,150 regions in the fly and worm, respectively, and these binding sites span as little as 8.7 and 5.8 Mb in the two organisms. Clusters with large numbers of sites (so-called high occupancy target, or HOT, regions) predominantly associate with broadly expressed genes, whereas clusters containing sites from just a few factors are associated with genes expressed in tissue-specific patterns. All of the strains expressing GFP-tagged TFs are available at the stock centers, and the chromatin immunoprecipitation sequencing data are available through the ENCODE Data Coordinating Center and also through a simple interface (http://epic.gs.washington.edu/modERN/) that provides rapid access to processed data sets. These data will facilitate a vast number of scientific inquiries into the function of individual TFs in key developmental, metabolic, defense, and homeostatic regulatory pathways, as well as provide a broader perspective on how individual TFs work together in local networks and globally across the life spans of these two key model organisms.
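The collapse of many overlapping binding sites into far fewer regions can be illustrated with a minimal interval-merging sketch. The consortium's actual clustering procedure is not described in this abstract, so the following is an assumption for illustration only: the function names `merge_sites` and `total_span`, the `max_gap` parameter, and the toy coordinates are all hypothetical and do not come from the modERN data.

```python
from collections import defaultdict


def merge_sites(sites, max_gap=0):
    """Merge (chrom, start, end) half-open intervals into regions.

    Hypothetical helper: sites on the same chromosome are merged when
    the next site starts no more than `max_gap` bases past the end of
    the current region (so overlapping and directly adjacent sites are
    merged when max_gap is 0).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in sites:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                # First site on this chromosome opens a region.
                cur_start, cur_end = start, end
            elif start <= cur_end + max_gap:
                # Site overlaps or touches the open region: extend it.
                cur_end = max(cur_end, end)
            else:
                # Gap found: close the open region and start a new one.
                regions.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        regions.append((chrom, cur_start, cur_end))
    return regions


def total_span(regions):
    """Total number of bases covered by the merged regions."""
    return sum(end - start for _, start, end in regions)


# Toy input (made-up coordinates): five sites, three of which cluster.
toy_sites = [
    ("chrI", 100, 300),
    ("chrI", 250, 400),
    ("chrI", 400, 450),
    ("chrI", 900, 1000),
    ("chrII", 50, 150),
]
toy_regions = merge_sites(toy_sites)
print(len(toy_sites), "sites ->", len(toy_regions), "regions,",
      total_span(toy_regions), "bp covered")
```

In this toy case five sites reduce to three regions, which mirrors in miniature how sites from many TFs can occupy a comparatively small part of the genome. Counting the distinct factors contributing to each merged region would be one plausible way to distinguish high-occupancy clusters from those bound by only a few factors, though that step is not shown here.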
Bibliographical note
Funding Information:
The authors thank the ENCODE Data Coordinating Center for providing data access; Elise Feingold for her support during the project; members of the Berkeley Drosophila Genome Project for their input, especially Erwin Frise for GFP image analysis; members of the University of Chicago core facilities High-throughput Genome Analysis Core (HGAC) and Genomics Facility (GF) (P30 CA014599), especially Adam Dedier and Jigyasa Tuteja, for sequencing; Stacy Holtzman and Thomas C. Kaufman for their production of 12 tagged transcription factor (TF) lines (bab1, Bro, CG7839, Eip74EF, ERR, Ets21C, gcm, lola, polybromo, pros, Sp1, and ss) made during the bridge period between modENCODE and modERN; Rebecca Spokony for her input and suggestions; the Bloomington Drosophila Stock Center (BDSC) for distributing the fly GFP-tagged TF strains; and the Caenorhabditis Genetics Center (CGC) for distributing the worm GFP-tagged TF strains. The BDSC and CGC are funded by the National Institutes of Health (NIH) Office of Research Infrastructure Programs P40 OD-018537 and P40 OD-010440, respectively. This work was supported by NIH grants U41-HG-007355 (R.H.W.) and R01-GM-076655 (S.E.C.), and a William Gates III Endowed Chair in Biomedical Sciences (R.H.W.).
© 2018 by the Genetics Society of America.
- Binding sites
- Caenorhabditis elegans
- Transcription factors