Haplotype-phased synthetic long reads from short-read sequencing

James A. Stapleton, Jeongwoon Kim, John P. Hamilton, Ming Wu, Luiz C. Irber, Rohan Maddamsetti, Bryan Briney, Linsey Newton, Dennis R. Burton, C. Titus Brown, Christina Chan, C. Robin Buell, Timothy A. Whitehead

    Research output: Research - peer-review; Article

    • 2 Citations

    Abstract

    Next-generation DNA sequencing has revolutionized the study of biology. However, the short read lengths of the dominant instruments complicate assembly of complex genomes and haplotype phasing of mixtures of similar sequences. Here we demonstrate a method to reconstruct the sequences of individual nucleic acid molecules up to 11.6 kilobases in length from short (150-bp) reads. We show that our method can construct 99.97%-accurate synthetic reads from bacterial, plant, and animal genomic samples, full-length mRNA sequences from human cancer cell lines, and individual HIV env gene variants from a mixture. The preparation of multiple samples can be multiplexed into a single tube, further reducing effort and cost relative to competing approaches. Our approach generates sequencing libraries in three days from less than one microgram of DNA in a single-tube format without custom equipment or specialized expertise.

    Language: English (US)
    Article number: e0147229
    Journal: PLoS One
    Volume: 11
    Issue number: 1
    DOI: 10.1371/journal.pone.0147229
    State: Published - Jan 1 2016

    Profile

    Haplotypes
    Genes
    DNA
    sampling
    methodology
    env Genes
    DNA Sequence Analysis
    Nucleic Acids
    HIV
    Genome
    Costs and Cost Analysis
    Cell Line
    Equipment and Supplies
    Messenger RNA
    Neoplasms
    Animals
    Cells
    Molecules
    Costs

    ASJC Scopus subject areas

    • Agricultural and Biological Sciences(all)
    • Biochemistry, Genetics and Molecular Biology(all)
    • Medicine(all)

    Cite this

    Stapleton, J. A., Kim, J., Hamilton, J. P., Wu, M., Irber, L. C., Maddamsetti, R., ... Whitehead, T. A. (2016). Haplotype-phased synthetic long reads from short-read sequencing. PLoS One, 11(1), [e0147229]. DOI: 10.1371/journal.pone.0147229

    @article{4becffc4a34e4d598822a92da20ed272,
    title = "Haplotype-phased synthetic long reads from short-read sequencing",
    abstract = "Next-generation DNA sequencing has revolutionized the study of biology. However, the short read lengths of the dominant instruments complicate assembly of complex genomes and haplotype phasing of mixtures of similar sequences. Here we demonstrate a method to reconstruct the sequences of individual nucleic acid molecules up to 11.6 kilobases in length from short (150-bp) reads. We show that our method can construct 99.97%-accurate synthetic reads from bacterial, plant, and animal genomic samples, full-length mRNA sequences from human cancer cell lines, and individual HIV env gene variants from a mixture. The preparation of multiple samples can be multiplexed into a single tube, further reducing effort and cost relative to competing approaches. Our approach generates sequencing libraries in three days from less than one microgram of DNA in a single-tube format without custom equipment or specialized expertise.",
    author = "Stapleton, {James A.} and Jeongwoon Kim and Hamilton, {John P.} and Ming Wu and Irber, {Luiz C.} and Rohan Maddamsetti and Bryan Briney and Linsey Newton and Burton, {Dennis R.} and Brown, {C. Titus} and Christina Chan and Buell, {C. Robin} and Whitehead, {Timothy A.}",
    year = "2016",
    month = "1",
    doi = "10.1371/journal.pone.0147229",
    volume = "11",
    journal = "PLoS One",
    issn = "1932-6203",
    publisher = "Public Library of Science",
    number = "1",

    }

    TY - JOUR

    T1 - Haplotype-phased synthetic long reads from short-read sequencing

    AU - Stapleton,James A.

    AU - Kim,Jeongwoon

    AU - Hamilton,John P.

    AU - Wu,Ming

    AU - Irber,Luiz C.

    AU - Maddamsetti,Rohan

    AU - Briney,Bryan

    AU - Newton,Linsey

    AU - Burton,Dennis R.

    AU - Brown,C. Titus

    AU - Chan,Christina

    AU - Buell,C. Robin

    AU - Whitehead,Timothy A.

    PY - 2016/1/1

    Y1 - 2016/1/1

    AB - Next-generation DNA sequencing has revolutionized the study of biology. However, the short read lengths of the dominant instruments complicate assembly of complex genomes and haplotype phasing of mixtures of similar sequences. Here we demonstrate a method to reconstruct the sequences of individual nucleic acid molecules up to 11.6 kilobases in length from short (150-bp) reads. We show that our method can construct 99.97%-accurate synthetic reads from bacterial, plant, and animal genomic samples, full-length mRNA sequences from human cancer cell lines, and individual HIV env gene variants from a mixture. The preparation of multiple samples can be multiplexed into a single tube, further reducing effort and cost relative to competing approaches. Our approach generates sequencing libraries in three days from less than one microgram of DNA in a single-tube format without custom equipment or specialized expertise.

    UR - http://www.scopus.com/inward/record.url?scp=84958576755&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84958576755&partnerID=8YFLogxK

    U2 - 10.1371/journal.pone.0147229

    DO - 10.1371/journal.pone.0147229

    M3 - Article

    VL - 11

    JO - PLoS One

    T2 - PLoS One

    JF - PLoS One

    SN - 1932-6203

    IS - 1

    M1 - e0147229

    ER -