Haplotype-phased synthetic long reads from short-read sequencing

James A. Stapleton, Jeongwoon Kim, John P. Hamilton, Ming Wu, Luiz C. Irber, Rohan Maddamsetti, Bryan Briney, Linsey Newton, Dennis R. Burton, C. Titus Brown, Christina Chan, C. Robin Buell, Timothy A. Whitehead

Research output: Contribution to journal › Article

Abstract

Next-generation DNA sequencing has revolutionized the study of biology. However, the short read lengths of the dominant instruments complicate assembly of complex genomes and haplotype phasing of mixtures of similar sequences. Here we demonstrate a method to reconstruct the sequences of individual nucleic acid molecules up to 11.6 kilobases in length from short (150-bp) reads. We show that our method can construct 99.97%-accurate synthetic reads from bacterial, plant, and animal genomic samples, full-length mRNA sequences from human cancer cell lines, and individual HIV env gene variants from a mixture. The preparation of multiple samples can be multiplexed into a single tube, further reducing effort and cost relative to competing approaches. Our approach generates sequencing libraries in three days from less than one microgram of DNA in a single-tube format without custom equipment or specialized expertise.

Language: English (US)
Article number: e0147229
Journal: PLoS One
Volume: 11
Issue number: 1
DOI: 10.1371/journal.pone.0147229
State: Published - Jan 1 2016

Profile

Haplotypes
Genes
env Genes
DNA
DNA Sequence Analysis
Nucleic Acids
Animals
DNA libraries
Cells
HIV
Genome
Costs and Cost Analysis
Cell Line
Equipment and Supplies
Messenger RNA
Molecules
sequence analysis
cell lines

ASJC Scopus subject areas

  • Agricultural and Biological Sciences (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Medicine (all)

Cite this

Stapleton, J. A., Kim, J., Hamilton, J. P., Wu, M., Irber, L. C., Maddamsetti, R., ... Whitehead, T. A. (2016). Haplotype-phased synthetic long reads from short-read sequencing. PLoS One, 11(1), [e0147229]. DOI: 10.1371/journal.pone.0147229

@article{4becffc4a34e4d598822a92da20ed272,
title = "Haplotype-phased synthetic long reads from short-read sequencing",
author = "Stapleton, {James A.} and Jeongwoon Kim and Hamilton, {John P.} and Ming Wu and Irber, {Luiz C.} and Rohan Maddamsetti and Bryan Briney and Linsey Newton and Burton, {Dennis R.} and Brown, {C. Titus} and Christina Chan and Buell, {C. Robin} and Whitehead, {Timothy A.}",
year = "2016",
month = "1",
day = "1",
doi = "10.1371/journal.pone.0147229",
language = "English (US)",
volume = "11",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "1",

}

TY - JOUR

T1 - Haplotype-phased synthetic long reads from short-read sequencing

AU - Stapleton,James A.

AU - Kim,Jeongwoon

AU - Hamilton,John P.

AU - Wu,Ming

AU - Irber,Luiz C.

AU - Maddamsetti,Rohan

AU - Briney,Bryan

AU - Newton,Linsey

AU - Burton,Dennis R.

AU - Brown,C. Titus

AU - Chan,Christina

AU - Buell,C. Robin

AU - Whitehead,Timothy A.

PY - 2016/1/1

Y1 - 2016/1/1

UR - http://www.scopus.com/inward/record.url?scp=84958576755&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84958576755&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0147229

DO - 10.1371/journal.pone.0147229

M3 - Article

VL - 11

JO - PLoS One

T2 - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 1

M1 - e0147229

ER -