Environmentally sensitive plant gene families, such as NBS-LRRs, receptor kinases, and defensins, are known to be highly variable. However, most existing strategies for discovering and describing structural variation in complex gene families provide incomplete and imperfect results. The move to de novo genome assemblies for multiple accessions or individuals within a species is enabling more comprehensive and accurate insights into gene family variation. Earlier array-based genome hybridization and sequence-based read mapping methods were limited by their reliance on a reference genome and by misplacement of paralogous sequences. Variant discovery based on de novo genome assemblies overcomes the problems arising from reliance on a reference genome and reduces sequence misplacement. As de novo genome sequencing moves to the use of longer reads, artifacts will be minimized, intact tandem gene clusters will be constructed accurately, and insights into rapid evolution will become feasible.
Bibliographical note
Funding Information:
We thank Jason Miller, Joann Mudge, and Peter Tiffin for useful discussions. We regret that, because of space limitations, it was not possible to cite the works of many additional scientists who contributed to this field. This work was supported, in part, by NSF grant DBI-1237993.