This pipeline was used to assemble the chloroplast and mitochondrial genomes of Streptocarpus rexii from Oxford Nanopore Technologies (ONT) long-read sequencing datasets. It was also tested with Arabidopsis thaliana and Oryza sativa long-read sequencing datasets, and the chloroplast genomes of both model plants were successfully assembled.
This protocol shows the details of the PLCL chloroplast genome assembly using Arabidopsis thaliana “KBS-Mac-74” ONT reads (ERR2173373; Michael et al. 2018).
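As a starting point, the reads for the accession named above could be fetched with the SRA Toolkit. The following is a minimal sketch, not part of the original pipeline: it assumes `prefetch` and `fasterq-dump` are installed and on the PATH, and the output directory name `reads` and the helper function names are hypothetical choices made for illustration. By default it only prints the commands (dry run) so they can be inspected before anything is downloaded.

```python
import shutil
import subprocess

# ONT run for A. thaliana KBS-Mac-74 named in this protocol
ACCESSION = "ERR2173373"


def build_download_commands(accession, outdir="reads"):
    """Return SRA Toolkit commands that fetch one run and convert it to FASTQ.

    `outdir` is a hypothetical directory name chosen for this sketch.
    """
    return [
        ["prefetch", accession],
        ["fasterq-dump", accession, "--outdir", outdir],
    ]


def download_reads(accession, outdir="reads", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    commands = build_download_commands(accession, outdir)
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Fail early with a clear message if the tool is missing.
            if shutil.which(cmd[0]) is None:
                raise RuntimeError(f"{cmd[0]} not found on PATH")
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    download_reads(ACCESSION)
```

Calling `download_reads(ACCESSION, dry_run=False)` would run the two commands in order and stop at the first failure, since `subprocess.run` is called with `check=True`.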
SRA Toolkit (https://trace.ncbi.nlm.nih.gov/Traces/sra)
NanoPlot (De Coster et al. 2018)
QUAST (Gurevich et al. 2013)
Samtools (Li et al. 2009)
NCBI BLAST+ (https://blast.ncbi.nlm.nih.gov)
Perl Algorithm-KMeans-2.05 (https://metacpan.org/pod/Algorithm::KMeans)
Wtdbg2 (Ruan and Li 2020)
De Coster, W., D'Hert, S., Schultz, D.T., Cruts, M. and Van Broeckhoven, C. (2018) NanoPack: visualizing and processing long-read sequencing data. Bioinformatics, 34, 2666-2669. [NanoPlot]
Gurevich, A., Saveliev, V., Vyahhi, N. and Tesler, G. (2013) QUAST: quality assessment tool for genome assemblies. Bioinformatics, 29, 1072-1075. [quast]
Koren, S., Walenz, B.P., Berlin, K., Miller, J.R., Bergman, N.H. and Phillippy, A.M. (2017) Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res., 27, 722-736. [canu]
Li, H. (2018) Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34, 3094-3100. [minimap2]
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R. and 1000 Genome Project Data Processing Subgroup (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25, 2078-2079. [samtools]
Michael, T.P., Jupe, F., Bemm, F., Motley, S.T., Sandoval, J.P., Lanz, C., Loudet, O., Weigel, D. and Ecker, J.R. (2018) High contiguity Arabidopsis thaliana genome assembly with a single nanopore flow cell. Nat. Commun., 9, 541.
Ruan, J. and Li, H. (2020) Fast and accurate long-read assembly with wtdbg2. Nat. Methods, 17, 155-158. [wtdbg2]
Wickham, H. (2016) ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org. [ggplot2]