seq

ONT R9.4.1 chemistry - DNA data

NA12878 PromethION data (~30X)

Complete Dataset (9.1M reads):

raw signal data:

| Description | SRA accession | Direct download link (md5sum) |
|---|---|---|
| BLOW5 format | SRR22186402 | na12878_prom_merged.blow5 (7e1a5900aff10e2cf1b97b8d3c6ecd1e), na12878_prom_merged.blow5.idx (a78919e8ac8639788942dbc3f1a2451a) |
| FAST5 format | SRR15058166 | fast5.tar.gz (0adbd2956a54528e92dd8fe6d42d2fce) |
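Because these files are large, it is worth checking a download against the md5sum listed beside it before use. The following is a minimal sketch, assuming the files were saved to the current directory under the names shown above; the helper names `md5_of_file` and `verify_download` are hypothetical and not part of any tool mentioned here.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Checksums copied from the listing above; the local file names are an
# assumption about where the downloads were saved.
EXPECTED = {
    "na12878_prom_merged.blow5": "7e1a5900aff10e2cf1b97b8d3c6ecd1e",
    "na12878_prom_merged.blow5.idx": "a78919e8ac8639788942dbc3f1a2451a",
    "fast5.tar.gz": "0adbd2956a54528e92dd8fe6d42d2fce",
}

if __name__ == "__main__":
    for name, expected in EXPECTED.items():
        if not Path(name).exists():
            print(f"{name}: not found, skipping")
            continue
        status = "OK" if verify_download(name, expected) else "MISMATCH"
        print(f"{name}: {status}")
```

A mismatch usually points to an interrupted or corrupted transfer, so downloading the file again is the first thing to try.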

basecalls:

| Basecaller model | SRA accession |
|---|---|
| Guppy 4.0.11 dna_r9.4.1_450bps_hac_prom | SRR15058167 |

Super-accuracy basecalls from a recent Guppy version are available on an in-house NAS. Open a GitHub issue if you would like me to share them.

Subset (500,000 reads):

raw signal data:

| Description | SRA run | Data access | Direct download link (md5sum) |
|---|---|---|---|
| BLOW5 format | SRR22186403 | | subsample_slow5.tar (6cdbe02c3844960bb13cf94b9c3173bb) |
| FAST5 format | SRR15058164 | | subsample.tar.gz (591ec7d1a2c6d13f7183171be8d31fba) |

basecalls:

| Basecaller model | SRA accession |
|---|---|
| Guppy 4.0.11 dna_r9.4.1_450bps_hac_prom | SRR15058164 |

Super-accuracy basecalls from a recent Guppy version are available on an in-house NAS. Open a GitHub issue if you would like me to share them.


The following datasets were uploaded by others. Some have very old basecalls, and in some the raw FAST5 signal files no longer work with the latest Guppy. I have converted them to BLOW5 format and basecalled them with a very recent Guppy version through buttery-eel.

NA12878 MinION data from Nanopore WGS Consortium

CHM13 from Telomere-to-telomere consortium project

SARS-CoV-2 SP1

Yeast (Saccharomyces cerevisiae)

Chlamydomonas (Chlamydomonas reinhardtii)

Zymo data from LomanLab