Endogenous retrotransposons have caused extensive genomic variation within mammalian species, but the functional implications of such mobilization are mostly unknown. We mapped thousands of endogenous retrovirus (ERV) germline integrants in highly divergent, previously unsequenced mouse lineages, facilitating a comparison of gene expression in the presence or absence of local insertions. Polymorphic ERVs occur relatively infrequently in gene introns and are particularly depleted from genes involved in embryogenesis or that are highly expressed in embryonic stem cells. Their genomic distribution implies ongoing negative selection due to deleterious effects on gene expression and function. A polymorphic, intronic ERV at Slc15a2 triggers up to 49-fold increases in premature transcriptional termination and up to 39-fold reductions in full-length transcripts in adult mouse tissues, thereby disrupting protein expression and functional activity. Prematurely truncated transcripts also occur at Polr1a, Spon1, and up to ~5% of other genes when intronic ERV polymorphisms are present. Analysis of expression quantitative trait loci (eQTLs) in recombinant BxD mouse strains demonstrated very strong genetic associations between the polymorphic ERV in cis and disrupted transcript levels. Premature polyadenylation is triggered at genomic distances up to >12.5 kb upstream of the ERV, both in cis and between alleles. The parent of origin of the ERV is associated with variable expression of nonterminated transcripts and differential DNA methylation at its 5′-long terminal repeat. This study defines an unexpectedly strong functional impact of ERVs in disrupting gene transcription at a distance and demonstrates that ongoing retrotransposition can contribute significantly to natural phenotypic diversity.
Copyright 2012 Elsevier B.V., All rights reserved.