ONT R9.4.1 chemistry - DNA data
NA12878 PromethION data (~30X)
Complete Dataset (9.1M reads):
raw signal data:
basecalls:
| Basecaller | Model | SRA accession |
|---|---|---|
| Guppy 4.0.11 | dna_r9.4.1_450bps_hac_prom | SRR15058167 |
Super accuracy basecalls from a recent Guppy version are available on an in-house NAS. Open a GitHub issue if you would like me to share them.
Subset (500,000 reads):
raw signal data:
basecalls:
| Basecaller | Model | SRA accession |
|---|---|---|
| Guppy 4.0.11 | dna_r9.4.1_450bps_hac_prom | SRR15058164 |
Super accuracy basecalls from a recent Guppy version are available on an in-house NAS. Open a GitHub issue if you would like me to share them.
The following datasets were uploaded by others. Some have very old basecalls, and for some the raw FAST5 signal files no longer work with the latest Guppy. I have converted them to BLOW5 format and rebasecalled them with a very recent Guppy through buttery-eel.
NA12878 MinION data from Nanopore WGS Consortium
- link: https://github.com/nanopore-wgs-consortium/NA12878/blob/master/Genome.md
- associated publication: https://www.nature.com/articles/nbt.4060
- raw signal data converted to BLOW5 format can be obtained from SRR23513620. File name: na12878_DNA_blow5.tar (md5sum: 2d02a7706d00572dcd9fcfa96e0357f4).
- Guppy 6.1.3 super accuracy rebasecalled reads (through buttery-eel wrapper, with dna_r9.4.1_450bps_sup.cfg model) that passed the default qscore filter: SRR23513621.
- f5c 1.1 CpG methylation call frequencies in tsv and bigwig formats are available from https://doi.org/10.6084/m9.figshare.21543330.v1.
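Because the BLOW5 archives are large, it is worth checking a download against the md5sum listed with it before unpacking. The following is a minimal sketch, assuming the tar file has already been downloaded into the current directory; the helper names `md5sum` and `verify_md5` are hypothetical and not part of any tool mentioned here.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that very large archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare the computed digest against the published one."""
    return md5sum(path) == expected.strip().lower()


if __name__ == "__main__":
    # File name and checksum as listed for the NA12878 BLOW5 archive.
    # The local path is an assumption; adjust it to where the file was saved.
    archive = Path("na12878_DNA_blow5.tar")
    expected = "2d02a7706d00572dcd9fcfa96e0357f4"
    if archive.exists():
        print("OK" if verify_md5(archive, expected) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

The same check applies to any other archive in this list that has an md5sum, by swapping in its file name and checksum.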
CHM13 from Telomere-to-telomere consortium project
- link: https://github.com/marbl/CHM13
- associated publication: https://www.science.org/doi/10.1126/science.abj6987
- raw signal data converted to BLOW5 format can be obtained from SRR23371619. File name: CHM13_T2T_ONT_blow5.tar (md5sum: 04f9d1c6ea2d11ccfc131c8244f059d3).
- Guppy 6.3.7 high accuracy rebasecalled reads (through buttery-eel wrapper, with dna_r9.4.1_450bps_hac.cfg model) that passed the default qscore filter: SRR23365080.
- f5c 1.1 CpG methylation call frequencies in tsv and bigwig formats are available from https://doi.org/10.6084/m9.figshare.21520950.v2.
SARS-CoV-2 SP1
- link: https://community.artic.network/t/links-to-raw-fast5-fastq-data-for-artic-protocol/17
- associated publication: https://www.sciencedirect.com/science/article/pii/S1477893920300806
- raw signal data in BLOW5 format and the associated BLOW5 index:
- Guppy 6.1.3 high accuracy rebasecalled reads (through buttery-eel wrapper, with dna_r9.4.1_450bps_hac.cfg model) that passed the default qscore filter, in FASTQ format: SP1-mapped_guppy_6.1.3_hac_pass.fastq.gz
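A quick sanity check after downloading a FASTQ file such as the one above is to count its reads. The sketch below assumes standard four-line FASTQ records with no blank lines and a locally saved copy of the file; `count_fastq_reads` is a hypothetical helper name, and no expected read count is given here because none is stated for this dataset.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a FASTQ file (plain or gzip-compressed).

    Assumes each record spans exactly four lines and that the
    header line of every record starts with '@'.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    count = 0
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 0:
                if not line.startswith("@"):
                    raise ValueError(
                        f"line {line_number + 1} is not a FASTQ header"
                    )
                count += 1
    return count


if __name__ == "__main__":
    import os

    # Local path is an assumption; adjust it to where the file was saved.
    fastq = "SP1-mapped_guppy_6.1.3_hac_pass.fastq.gz"
    if os.path.exists(fastq):
        print(count_fastq_reads(fastq))
    else:
        print(f"{fastq} not found; download it first")
```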
Yeast (Saccharomyces cerevisiae)
- link: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA510813
- associated publication: https://genome.cshlp.org/content/29/8/1329
Chlamydomonas (Chlamydomonas reinhardtii)
- link: https://www.ncbi.nlm.nih.gov/sra/ERR3237140/
- associated publication: https://www.nature.com/articles/s41467-019-10168-2
Zymo data from LomanLab
- link: https://github.com/LomanLab/mockcommunity
- associated publication: https://academic.oup.com/gigascience/article/8/5/giz043/5486468
- raw signal data in BLOW5 format and the associated BLOW5 index for the Zymo-GridION-EVEN-BB-SN sample:
- Guppy 6.1.3 high accuracy rebasecalled reads (through buttery-eel wrapper, with dna_r9.4.1_450bps_hac.cfg model) that passed the default qscore filter, in FASTQ format, for the Zymo-GridION-EVEN-BB-SN sample: Zymo-GridION-EVEN-BB-SN_guppy_6.1.3_hac_pass.gz