Abstract
Public gene sequence databases have become important research tools for understanding viruses and other organisms. Evidence suggests that the identifying information for some sequences in these databases might not belong to the sequences it is associated with. We developed two tests and used them to analyze all published sequences of the hemagglutinin and neuraminidase genes of avian influenza viruses (AIVs) in order to identify sequences that may have been misclassified. The first test identified sequence pairs with highly similar nucleotide sequences despite a difference of several years between their sampling dates. The second test, which was applied to samples sequenced and deposited more than once, detected sequences that had more nucleotide differences from other deposits of the same sample than from their closest relatives. All sequences identified as misclassified were traced to the relevant publications to assess the likelihood of contamination and to determine whether any conclusions depended on the use of these sequences. Our results suggested that, among the 4040 published gene sequences examined, approximately 0.8% might be misclassified and that publications using these sequences may include inaccurate statements. Findings from this report suggest that using laboratory-adapted strains and handling multiple samples simultaneously increase the risk of contamination. The tests reported here may be useful for screening new submissions to public sequence databases.
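The two tests described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the record layouts, the mismatch-count distance, and the thresholds `max_diff` and `min_year_gap` are all assumptions made for illustration, since the abstract does not state how similarity was measured or which cutoffs were used. Sequences are assumed to be pre-aligned and of equal length.

```python
from itertools import combinations


def hamming(a, b):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def flag_similar_across_years(records, max_diff=1, min_year_gap=3):
    """Sketch of the first test: near-identical pairs sampled years apart.

    records: list of (record_id, sampling_year, sequence) tuples.
    max_diff and min_year_gap are illustrative thresholds (assumptions).
    """
    flagged = []
    for (id_a, year_a, seq_a), (id_b, year_b, seq_b) in combinations(records, 2):
        far_apart = abs(year_a - year_b) >= min_year_gap
        if far_apart and hamming(seq_a, seq_b) <= max_diff:
            flagged.append((id_a, id_b))
    return flagged


def flag_inconsistent_duplicates(records):
    """Sketch of the second test: repeat deposits that disagree with each other.

    records: list of (record_id, sample_name, sequence) tuples.
    A record is flagged when it differs more from another deposit of the
    same sample than from its nearest sequence from a different sample.
    """
    flagged = []
    for rid, sample, seq in records:
        own = [hamming(seq, s) for r, smp, s in records
               if smp == sample and r != rid]
        others = [hamming(seq, s) for r, smp, s in records if smp != sample]
        if own and others and max(own) > min(others):
            flagged.append(rid)
    return flagged


# Toy data (invented for illustration, not taken from the study).
dated = [
    ("A", 1980, "ACGTACGT"),
    ("B", 2001, "ACGTACGT"),
    ("C", 2002, "ACGTTCAA"),
]
deposits = [
    ("d1", "s1", "AAAAAAAA"),
    ("d2", "s1", "AAAATTTT"),
    ("r1", "s2", "AAAATTTA"),
    ("r2", "s3", "AAAAAAAT"),
]

similar_pairs = flag_similar_across_years(dated)
inconsistent = flag_inconsistent_duplicates(deposits)
print(similar_pairs)
print(inconsistent)
```

In the toy data, both deposits of sample `s1` are flagged, because the sequences alone cannot show which of the two is the mislabeled one. That is consistent with the abstract's follow-up step of tracing flagged sequences back to their publications.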
| Original language | English (US) |
|---|---|
| Pages (from-to) | 29-36 |
| Number of pages | 8 |
| Journal | Genomics |
| Volume | 95 |
| Issue number | 1 |
| DOIs | |
| State | Published - Jan 2010 |
Bibliographical note
Funding Information: This work was funded by the Director for National Intelligence, Postdoctoral Research Fellowship Program (Washington, DC).
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Avian influenza virus
- Database
- Hemagglutinin
- Neuraminidase
- Sequence
Fingerprint
Dive into the research topics of 'Identifying errors in avian influenza virus gene sequences and implications for data usage of public databases'. Together they form a unique fingerprint.