Description
These data are genotypes of a pepper diversity collection. Samples were sequenced by double-digest genotyping-by-sequencing (GBS) with the ApeKI and BtgI restriction enzymes. FASTQ files were demultiplexed with Illumina bcl2fastq software (http://emea.support.illumina.com/downloads/bcl2fastq-conversion-software-v2-20.html). Trimmomatic (Bolger et al., 2014) was used to remove the first 12 bases (adapter sequences) of each read. Cleaned reads were aligned to the C. annuum reference genome (UCD-10X-F1; a cross between the Criollos de Morelos 334 landrace and a non-pungent blocky pepper breeding line; Hulse-Kemp et al., 2018) using BWA-MEM (Li, 2013). Variants were called jointly across all samples with FreeBayes (Garrison & Marth, 2012). The initial VCF file was filtered with VCFtools to remove variants with a minor allele frequency < 1%, variants with genotype rates < 95%, and samples with genotype rates < 10%. This yielded a total of 22,916 SNPs across the 12 chromosomes. In addition, a subset of the lines was phenotyped for vitamin content; we provide these phenotypes together with a set of 2,966 markers that can be used to explore them.
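The three filtering thresholds described above can be illustrated with a minimal Python sketch. This is not the VCFtools procedure itself and does not read the deposited files. It assumes biallelic diploid genotypes already decoded into alternate-allele counts (0, 1, 2, or `None` for a missing call), and it assumes the variant filters are applied before the sample filter, an order the description does not specify. The function names and the matrix layout (rows are variants, columns are samples) are hypothetical.

```python
from typing import List, Optional, Tuple

# Alternate-allele count (0, 1 or 2) for a diploid call, or None if missing.
Genotype = Optional[int]


def call_rate(calls: List[Genotype]) -> float:
    """Fraction of non-missing genotype calls in a row or column."""
    if not calls:
        return 0.0
    return sum(g is not None for g in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """Minor allele frequency of one biallelic variant, ignoring missing calls."""
    called = [g for g in calls if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def filter_genotypes(
    matrix: List[List[Genotype]],
    min_maf: float = 0.01,                # drop variants with MAF < 1%
    min_variant_call_rate: float = 0.95,  # drop variants genotyped in < 95% of samples
    min_sample_call_rate: float = 0.10,   # drop samples genotyped at < 10% of variants
) -> Tuple[List[int], List[int]]:
    """Return (kept variant indices, kept sample indices).

    Assumption: variants are filtered first, then samples are assessed
    on the variants that remain.
    """
    kept_variants = [
        i for i, row in enumerate(matrix)
        if minor_allele_frequency(row) >= min_maf
        and call_rate(row) >= min_variant_call_rate
    ]
    n_samples = len(matrix[0]) if matrix else 0
    kept_samples = [
        j for j in range(n_samples)
        if call_rate([matrix[i][j] for i in kept_variants]) >= min_sample_call_rate
    ]
    return kept_variants, kept_samples
```

Calling `filter_genotypes(matrix)` with the default arguments applies the thresholds quoted in the description. Passing looser values is convenient when trying the sketch on a small toy matrix.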
Date made available | Oct 31 2022 |
---|---|
Publisher | ZENODO |