Primary sequence and epigenetic determinants of in vivo occupancy of genomic DNA by GATA1

Ying Zhang, Weisheng Wu, Yong Cheng, David C. King, Robert S. Harris, James Taylor, Francesca Chiaromonte, Ross C. Hardison

Research output: Contribution to journal › Article › peer-review

26 Scopus citations


DNA sequence motifs and epigenetic modifications contribute to specific binding by a transcription factor, but the extent to which each feature determines occupancy in vivo is poorly understood. We addressed this question in erythroid cells by identifying DNA segments occupied by GATA1 and measuring the level of trimethylation of histone H3 lysine 27 (H3K27me3) and monomethylation of H3 lysine 4 (H3K4me1) along a 66 Mb region of mouse chromosome 7. While 91% of the GATA1-occupied segments contain the consensus binding-site motif WGATAR, only ~0.7% of DNA segments with such a motif are occupied. Using a discriminative motif enumeration method, we identified additional motifs predictive of occupancy given the presence of WGATAR. The specific motif variant AGATAA and occurrence of multiple WGATAR motifs are both strong discriminators. Combining motifs to pair a WGATAR motif with a binding site motif for GATA1, EKLF or SP1 improves discriminative power. Epigenetic modifications are also strong determinants, with the factor-bound segments highly enriched for H3K4me1 and depleted of H3K27me3. Combining primary sequence and epigenetic determinants captures 52% of the GATA1-occupied DNA segments and substantially increases the specificity, to one out of seven segments with the required motif combination and epigenetic signals being bound.
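As an illustration of the sequence features named in the abstract, the following is a minimal sketch of scanning a DNA segment for the consensus WGATAR motif on both strands, counting occurrences and flagging the AGATAA variant. It is not the authors' discriminative motif enumeration method. The function names `find_wgatar` and `summarize` are hypothetical, and the sketch assumes the standard IUPAC codes (W = A or T, R = A or G).

```python
import re

# IUPAC codes in the consensus: W = A or T, R = A or G.
# Lookahead patterns are used so that overlapping sites are all reported.
FORWARD = re.compile(r"(?=([AT]GATA[AG]))")
# The reverse complement of WGATAR is YTATCW (Y = C or T).
REVERSE = re.compile(r"(?=([CT]TATC[AT]))")


def find_wgatar(seq):
    """Return sorted (start, strand, site) tuples for WGATAR on either strand.

    `site` is the matched text as it reads on the forward strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in FORWARD.finditer(seq)]
    hits += [(m.start(), "-", m.group(1)) for m in REVERSE.finditer(seq)]
    return sorted(hits)


def summarize(seq):
    """Summarize two features the abstract reports as strong discriminators."""
    hits = find_wgatar(seq)
    return {
        # Number of WGATAR occurrences counted over both strands.
        "n_wgatar": len(hits),
        # AGATAA on the forward strand reads TTATCT on the reverse strand.
        "has_agataa": any(site in ("AGATAA", "TTATCT") for _, _, site in hits),
    }


if __name__ == "__main__":
    print(summarize("CCAGATAAGGTTATCACC"))
```

A scan like this only finds candidate sites. As the abstract notes, most segments containing the motif are not occupied, so motif presence would need to be combined with other sequence and epigenetic features to predict binding.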

Original language: English (US)
Article number: gkp747
Pages (from-to): 7024-7038
Number of pages: 15
Journal: Nucleic Acids Research
Issue number: 21
State: Published - Sep 18 2009
Externally published: Yes

Bibliographical note

Funding Information:
National Institutes of Health (grant number R01 DK65806); Tobacco Settlement Funds from the Pennsylvania Department of Health; and the Huck Institutes of Life Sciences, Pennsylvania State University. Funding for open access charge: National Institutes of Health grant R01 DK65806.

