TY - JOUR
T1 - Evaluating methods of updating training data in long-term genomewide selection
AU - Neyhart, Jeffrey L.
AU - Tiede, Tyler
AU - Lorenz, Aaron J.
AU - Smith, Kevin P.
N1 - Publisher Copyright:
© 2017 Neyhart et al.
PY - 2017/5/1
Y1 - 2017/5/1
N2 - Genomewide selection is hailed for its ability to facilitate greater genetic gains per unit time. Over breeding cycles, the requisite linkage disequilibrium (LD) between quantitative trait loci and markers is expected to change as a result of recombination, selection, and drift, leading to a decay in prediction accuracy. Previous research has identified the need to update the training population using data that may capture new LD generated over breeding cycles; however, optimal methods of updating have not been explored. In a barley (Hordeum vulgare L.) breeding simulation experiment, we examined prediction accuracy and response to selection when updating the training population each cycle with the best predicted lines, the worst predicted lines, both the best and worst predicted lines, random lines, criterion-selected lines, or no lines. In the short term, we found that updating with the best predicted lines or the best and worst predicted lines resulted in high prediction accuracy and genetic gain, but in the long term, all methods (besides not updating) performed similarly. We also examined the impact of including all data in the training population or only the most recent data. Though patterns among update methods were similar, using a smaller but more recent training population provided a slight advantage in prediction accuracy and genetic gain. In an actual breeding program, a breeder might desire to gather phenotypic data on lines predicted to be the best, perhaps to evaluate possible cultivars. Therefore, our results suggest that an optimal method of updating the training population is also very practical.
AB - Genomewide selection is hailed for its ability to facilitate greater genetic gains per unit time. Over breeding cycles, the requisite linkage disequilibrium (LD) between quantitative trait loci and markers is expected to change as a result of recombination, selection, and drift, leading to a decay in prediction accuracy. Previous research has identified the need to update the training population using data that may capture new LD generated over breeding cycles; however, optimal methods of updating have not been explored. In a barley (Hordeum vulgare L.) breeding simulation experiment, we examined prediction accuracy and response to selection when updating the training population each cycle with the best predicted lines, the worst predicted lines, both the best and worst predicted lines, random lines, criterion-selected lines, or no lines. In the short term, we found that updating with the best predicted lines or the best and worst predicted lines resulted in high prediction accuracy and genetic gain, but in the long term, all methods (besides not updating) performed similarly. We also examined the impact of including all data in the training population or only the most recent data. Though patterns among update methods were similar, using a smaller but more recent training population provided a slight advantage in prediction accuracy and genetic gain. In an actual breeding program, a breeder might desire to gather phenotypic data on lines predicted to be the best, perhaps to evaluate possible cultivars. Therefore, our results suggest that an optimal method of updating the training population is also very practical.
KW - Barley
KW - GenPred
KW - Genomic Selection
KW - Optimization
KW - Shared Data Resources
KW - Simulation
KW - Training population
UR - http://www.scopus.com/inward/record.url?scp=85019179674&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85019179674&partnerID=8YFLogxK
U2 - 10.1534/g3.117.040550
DO - 10.1534/g3.117.040550
M3 - Article
C2 - 28315831
AN - SCOPUS:85019179674
SN - 2160-1836
VL - 7
SP - 1499
EP - 1510
JO - G3: Genes, Genomes, Genetics
JF - G3: Genes, Genomes, Genetics
IS - 5
ER -