Heuristic and Hierarchical-Based Population Mining of Salmonella enterica Lineage I Pan-Genomes as a Platform to Enhance Food Safety

Joao Carlos Gomes-Neto, Natasha Pavlovikj, Carmen Cano, Baha Abdalhamid, Gabriel Asad Al-Ghalith, John Dustin Loy, Dan Knights, Peter C. Iwen, Byron D. Chaves, Andrew K. Benson

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

The recent incorporation of bacterial whole-genome sequencing (WGS) into public health laboratories has enhanced foodborne outbreak detection and source attribution. As a result, large volumes of publicly available datasets can be used to study the biology of foodborne pathogen populations at an unprecedented scale. To demonstrate the application of a heuristic and agnostic hierarchical population structure-guided pan-genome enrichment analysis (PANGEA), we used populations of S. enterica lineage I to achieve two main objectives: (i) show how hierarchical population inquiry at different scales of resolution can enhance ecological and epidemiological inquiries; and (ii) identify population-specific inferable traits that could provide selective advantages in food production environments. Publicly available WGS data were obtained from the NCBI database for three serovars of Salmonella enterica subsp. enterica lineage I (S. Typhimurium, S. Newport, and S. Infantis). Using the hierarchical genotypic classifications (Serovar, BAPS1, ST, cgMLST), datasets from each of the three serovars showed varying degrees of clonal structuring. When the accessory genome (PANGEA) was mapped onto these hierarchical structures, accessory loci could be linked with specific genotypes. A large heavy-metal resistance mobile element was found in the Monophasic ST34 lineage of S. Typhimurium, and laboratory testing showed that Monophasic isolates have, on average, a higher degree of copper resistance than the Biphasic ones. In S. Newport, an extra sugE gene copy was found among most isolates of the ST45 lineage, and laboratory testing of multiple isolates confirmed that isolates of S. Newport ST45 were, on average, less sensitive to the disinfectant cetylpyridinium chloride than non-ST45 isolates. Lastly, data-mining of the accessory genomic content of S. Infantis revealed two cryptic Ecotypes with distinct accessory genomic content and distinct ecological patterns. Poultry appears to be the major reservoir for Ecotype 1, and temporal analysis further suggested a recent ecological succession, with Ecotype 2 apparently being displaced by Ecotype 1. Altogether, the use of a heuristic hierarchical-based population structure analysis that includes bacterial pan-genomes (core and accessory genomes) can (1) improve genomic resolution for mapping populations and accessing epidemiological patterns; and (2) define lineage-specific informative loci that may be associated with survival in the food chain.
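The idea of mapping accessory loci onto a hierarchical genotype can be illustrated with a minimal Python sketch. It is not the authors' PANGEA or ProkEvo implementation, whose details are not given in this abstract. The isolate records, the group labels (`ST_A`, `ST_B`), the locus names, and the `min_in`/`max_out` thresholds are all hypothetical, and a simple frequency cutoff stands in for whatever enrichment criterion the study actually used.

```python
# Toy records: each isolate has genotype labels at several hierarchical
# levels plus the set of accessory loci it carries. All values are made up.
ISOLATES = [
    {"id": "iso1", "serovar": "X", "st": "ST_A", "accessory": {"locus_1", "locus_2"}},
    {"id": "iso2", "serovar": "X", "st": "ST_A", "accessory": {"locus_1"}},
    {"id": "iso3", "serovar": "X", "st": "ST_B", "accessory": {"locus_2"}},
    {"id": "iso4", "serovar": "X", "st": "ST_B", "accessory": {"locus_2", "locus_3"}},
]


def carrier_fraction(isolates, locus):
    """Fraction of the given isolates that carry the locus (0.0 if none given)."""
    if not isolates:
        return 0.0
    return sum(locus in iso["accessory"] for iso in isolates) / len(isolates)


def enriched_loci(isolates, level, group, min_in=0.8, max_out=0.2):
    """Loci common inside `group` (at hierarchy `level`) and rare outside it.

    The thresholds are illustrative assumptions, not values from the study.
    """
    inside = [iso for iso in isolates if iso[level] == group]
    outside = [iso for iso in isolates if iso[level] != group]
    all_loci = set().union(*(iso["accessory"] for iso in isolates))
    return sorted(
        locus
        for locus in all_loci
        if carrier_fraction(inside, locus) >= min_in
        and carrier_fraction(outside, locus) <= max_out
    )


if __name__ == "__main__":
    for group in ("ST_A", "ST_B"):
        print(group, enriched_loci(ISOLATES, "st", group))
```

Changing `level` from `"st"` to another key such as `"serovar"` repeats the same query at a coarser scale of resolution, which is the sense in which the genotype labels form a hierarchy here.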

Original language: English (US)
Article number: 725791
Journal: Frontiers in Sustainable Food Systems
Volume: 5
State: Published - Oct 1 2021

Bibliographical note

Funding Information:
This research could only be completed by utilizing the Holland Computing Center (HCC) at UNL, which receives support from the Nebraska Research Initiative. We are also thankful for having access, through the HCC, to resources provided by the Open Science Grid (OSG), which is supported by the National Science Foundation and the U.S. Department of Energy's Office of Science. This work used the Pegasus Workflow Management Software, which is funded by the National Science Foundation (grant #1664162). We would like to once again express our gratitude to Mats Rynge for helping us run ProkEvo on OSG. We also thank Dr. Derek Weitzel and Karan Vahi for their continual technical computational support, and Dr. Peter Evans from USDA-FSIS for his suggestions on data presentation and interpretation. This paper is dedicated to the memory of Dr. David Swanson, who was the director of the HCC at our institution and sadly passed away recently. Dr. David Swanson was a kind and sincere person who was an inspiration to us all through his fantastic work in setting the path for scalable and parallel computing in our institution and beyond. Unfortunately, he could not see the completion of this work, but without him we would not have been able to reach our goals.

Funding Information:
This work was supported by funding provided by the UNL-IANR Agricultural Research Division and the National Institute for Antimicrobial Resistance Research and Education, the Nebraska Food for Health Center at the Food Science and Technology Department (UNL), and by the University of Nebraska Foundation (Layman Award).

Publisher Copyright:
Copyright © 2021 Gomes-Neto, Pavlovikj, Cano, Abdalhamid, Al-Ghalith, Loy, Knights, Iwen, Chaves and Benson.

Keywords

  • Salmonella enterica
  • food safety
  • foodborne pathogens
  • pan-genome
  • population genomics
  • whole-genome sequencing
