Cornetto is a method for iterative genome assembly using nanopore sequencing from Oxford Nanopore Technologies (ONT). This repository documents the Cornetto bioinformatics protocol and the Cornetto toolkit (a programme written in C and a collection of shell scripts). The Cornetto toolkit also includes commands for evaluating assemblies generated by any other method.
Cornetto is under development, so the interface and default parameters may change. Do not hesitate to open an issue if you find a bug, if something is unclear, or if you have a feature request.
Documentation: https://hasindu2008.github.io/cornetto
Using helper scripts:
scripts/telostats.sh asm.fasta # prints the telomere counts
scripts/minidotplot.sh ref.fasta asm.fasta # creates reference vs assembly dotplot. output: assembly.eps
scripts/asmstats.sh asm.fasta # prints chromosome-wise assembly to reference report
Using individual commands:
# creating a dot plot
## 1. align assembly to reference
minimap2 --eqx -cx asm5 ref.fasta asm.fasta > asm.paf
## 2. fix the +/- directions to match the reference
cornetto fixasm asm.fasta asm.paf -r asm.report.tsv -w asm.fix.paf > asm.fix.fasta
## 3. dot plot
cornetto minidot asm.fix.paf -f 2 > asm.eps
# per-chromosome evaluation. Note: asm.windows.0.4.50kb.ends.bed is from `scripts/telostats.sh`
cornetto asmstats asm.paf asm.windows.0.4.50kb.ends.bed -r asm.report.tsv -s ref.fasta
# miscellaneous commands
cornetto fa2bed asm.fasta > asm.bed # create a bed file with assembly contig lengths
cornetto seq -m 10000 reads.fastq > long.fastq # extract reads >=10kb
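The three dot-plot steps above can be chained in a small wrapper. This is only a sketch: the function names `run_to` and `cornetto_dotplot`, the `DRY_RUN` variable and the output prefix argument are hypothetical and are not part of the Cornetto toolkit. The `minimap2` and `cornetto` invocations are copied unchanged from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: run a command with stdout redirected to a file,
# or only print the command line when DRY_RUN=1.
run_to() {
    out=$1
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$* > $out"
    else
        "$@" > "$out"
    fi
}

# Hypothetical wrapper: reference vs assembly dot plot for one assembly.
# Arguments: reference FASTA, assembly FASTA, prefix for output files.
cornetto_dotplot() {
    if [ "$#" -ne 3 ]; then
        echo "usage: cornetto_dotplot ref.fasta asm.fasta prefix" >&2
        return 1
    fi
    ref=$1
    asm=$2
    prefix=$3
    # 1. align assembly to reference
    run_to "$prefix.paf" minimap2 --eqx -cx asm5 "$ref" "$asm" || return 1
    # 2. fix the +/- directions to match the reference
    run_to "$prefix.fix.fasta" cornetto fixasm "$asm" "$prefix.paf" \
        -r "$prefix.report.tsv" -w "$prefix.fix.paf" || return 1
    # 3. dot plot
    run_to "$prefix.eps" cornetto minidot "$prefix.fix.paf" -f 2 || return 1
}
```

After sourcing the file, `DRY_RUN=1 cornetto_dotplot ref.fasta asm.fasta asm` prints the three commands without running them, which is a cheap way to check the file names before starting a long alignment.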
`cornetto minidot` is simply minidot from miniasm (MIT licensed).
`cornetto sdust` is simply sdust.
`cornetto telofind`, `telobreaks` and `telowin` are based on the VGP telomere scripts (BSD licensed).

Regarding the name: the method was named by Ira Deveson after the ‘Cornetto’ ice-cream cone. A famous TV advertisement in Australia in the 2000s promoted the Cornetto as having ‘no boring bits’. In our Cornetto workflow, we reject the ‘boring bits’, the bits that are too easy for the assemblers to resolve.