Genomic structural variants constitute the majority of variable base pairs in primate genomes and affect gene function in multiple ways. While whole-gene duplications and deletions are relatively well studied, the biology of subexonic (i.e., within coding exon sequences) copy number variation remains elusive. The salivary MUC7 gene provides an opportunity for studying such variation, as it harbors copy number variable subexonic repeat sequences that encode densely O-glycosylated domains (PTS-repeats) with microbe-binding properties. To understand the evolution of this gene, we analyzed mammalian and primate genomes within a comparative framework. Our analyses revealed that (i) MUC7 emerged in the placental mammal ancestor and rapidly gained multiple sites for O-glycosylation; (ii) MUC7 has retained its extracellular activity in saliva in placental mammals; (iii) the anti-fungal domain of the protein was remodified under positive selection in the primate lineage; and (iv) MUC7 PTS-repeats have evolved recurrently and under adaptive constraints. Our results establish MUC7 as a major player in salivary adaptation, likely as a response to diverse pathogenic exposure in primates. On a broader scale, our study highlights variable subexonic repeats as a primary source of modular evolutionary innovation that leads to rapid functional adaptation.
Funding Information:
This study is primarily funded by O.G.'s start-up funds, as well as an IMPACT grant from the University at Buffalo Research Foundation. P.P. was funded by grants FP7 REGPOT-InnovCrete (No. 316223) and FP7-PEOPLE-2013-IEF EVOGREN (625057). S.R. is funded by NIH grants R01DE019807 and R21DE025826.