An Encoding Scheme to Enlarge Practical DNA Storage Capacity by Reducing Primer-Payload Collisions

Yixun Wei, Bingzhe Li, David H.C. Du

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Scopus citations

Abstract

Deoxyribonucleic Acid (DNA), with its ultra-high storage density and long durability, is a promising long-term archival storage medium and is attracting much attention today. A DNA storage system encodes and stores digital data with synthetic DNA sequences and decodes DNA sequences back to digital data via sequencing. Many encoding schemes have been proposed to enlarge DNA storage capacity by increasing DNA encoding density. However, only increasing encoding density is insufficient because enhancing DNA storage capacity is a multifaceted problem.This paper assumes that random accesses are necessary for practical DNA archival storage. We identify all factors affecting DNA storage capacity under current technologies and systematically investigate the practical DNA storage capacity with several popular encoding schemes. The investigation result shows the collision between primers and DNA payload sequences is a major factor limiting DNA storage capacity. Based on this discovery, we designed a new encoding scheme called Collision Aware Code (CAC) to trade some encoding density for the reduction of primer-payload collisions. Compared with the best result among the five existing encoding schemes, CAC can extricate 120% more primers from collisions and increase the DNA tube capacity from 211.96 GB to 295.11 GB. Besides, we also evaluate CAC's recoverability from DNA storage errors. The result shows CAC is comparable to those of existing encoding schemes.

Original languageEnglish (US)
Title of host publicationSummer Cycle
PublisherAssociation for Computing Machinery
Pages71-84
Number of pages14
ISBN (Electronic)9798400703850
StatePublished - Apr 27 2024
Event29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2024 - San Diego, United States
Duration: Apr 27 2024May 1 2024

Publication series

NameInternational Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
Volume2

Conference

Conference29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2024
Country/TerritoryUnited States
CitySan Diego
Period4/27/245/1/24

Bibliographical note

Publisher Copyright:
© 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.

Keywords

  • DNA encoding scheme
  • DNA storage
  • primer-payload collision

Fingerprint

Dive into the research topics of 'An Encoding Scheme to Enlarge Practical DNA Storage Capacity by Reducing Primer-Payload Collisions'. Together they form a unique fingerprint.

Cite this