
Summary of SLOW5 ASCII format

This page summarises the latest version of the SLOW5 ASCII file format (.slow5). For the full specification, including the SLOW5 binary format (called BLOW5), refer to the PDF links here.

A SLOW5 ASCII file is a plain text file that uses the American Standard Code for Information Interchange (ASCII) encoding (locale: C/POSIX, code set: US-ASCII). The file extension is .slow5. A SLOW5 file contains a header followed by the sequencing data. Example structures of a SLOW5 ASCII file with a single read group and with multiple read groups (i.e., multiple sequencing runs) are provided below. Any column/row borders, spacing and cell colours in the examples are there only to improve readability. The actual format uses tabs (‘\t’) and newlines (‘\n’) as delimiters.

Example of a SLOW5 ASCII file with a single read group:

#slow5_version 1.0.0              
#num_read_groups 1              
@asic_id 0004A30B00232BEC              
@exp_start_time 2020-01-01T00:00:00Z              
@flow_cell_id FAH00000              
@run_id 855cdb              
             
#char* uint32_t double double double double uint64_t int16_t*
#read_id read_group digitisation offset range sampling_rate len_raw_signal raw_signal
read0 0 8192 6 1467.6 4000 123456 498,492,…
read1 0 8192 5 1467.6 4000 2000 491,491,…
readN 0 8192 3 1467.6 4000 3000 400,400,…

Example of a SLOW5 ASCII file with multiple read groups:

#slow5_version 1.0.0              
#num_read_groups 3              
@asic_id 0004A30B00232BEC 1004A30B00232BEC 2004A30B00232BEC          
@exp_start_time 2020-01-01T00:00:00Z 2020-01-01T00:00:00Z 2020-01-01T00:00:00Z          
@flow_cell_id FAH00000 FAH00001 FAH00002          
@run_id 855cdb 855cd1 855cdc          
           
#char* uint32_t double double double double uint64_t int16_t*
#read_id read_group digitisation offset range sampling_rate len_raw_signal raw_signal
read-0 1 8192 6 1467.6 4000 4000 498,492,…
read-1 0 8192 5 1467.6 4000 2000 491,491,…
read-N 2 8192 3 1467.6 4000 3000 400,400,…
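To make the delimiters concrete, the following minimal Python sketch writes a miniature single-read-group file with explicit tabs and newlines. The read IDs and signal values are made up for illustration (the examples above elide the real signals), and only two of the data header attributes are included.

```python
# Hypothetical miniature SLOW5 ASCII file; read IDs and signal values are made up.
header_lines = [
    ["#slow5_version", "1.0.0"],
    ["#num_read_groups", "1"],
    ["@asic_id", "0004A30B00232BEC"],
    ["@run_id", "855cdb"],
    ["#char*", "uint32_t", "double", "double", "double", "double",
     "uint64_t", "int16_t*"],
    ["#read_id", "read_group", "digitisation", "offset", "range",
     "sampling_rate", "len_raw_signal", "raw_signal"],
]
records = [
    ["read0", "0", "8192", "6", "1467.6", "4000", "3", "498,492,495"],
    ["read1", "0", "8192", "5", "1467.6", "4000", "2", "491,491"],
]

# Columns are joined with tabs; every line ends with a newline.
text = "".join("\t".join(fields) + "\n" for fields in header_lines + records)

print(repr(text.splitlines(keepends=True)[0]))  # prints '#slow5_version\t1.0.0\n'
```

Within the raw_signal column the samples are separated by commas, so a tab never occurs inside a field.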

SLOW5 Header

The SLOW5 header stores metadata regarding the experiment. Header lines start with either ‘#’ or ‘@’. The header contains two parts: the global header and the data header.

Lines starting with ‘#’ form the global header.

Data header

The header lines that start with ‘@’ form the data header. These header lines contain ONT data attributes that are shared across multiple reads in a sequencing run (read group). For instance, the run_id and the flow_cell_id are common to all the reads in the read group and are therefore stored in the data header.
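As a rough illustration, the header could be read into two dictionaries as sketched below. The function name `parse_header` is hypothetical and not part of any SLOW5 library. The sketch assumes that the i-th value on a ‘@’ line belongs to read group i, as the multiple read group example suggests, and it files the column type and column name lines with the other ‘#’ lines.

```python
def parse_header(lines):
    """Split header lines into global ('#') and data ('@') attributes.

    Hypothetical helper, not part of any SLOW5 library.
    """
    global_header = {}  # e.g. "slow5_version" -> ["1.0.0"]
    data_header = {}    # attribute name -> list of values, one per read group
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        key, values = fields[0], fields[1:]
        if key.startswith("@"):
            data_header[key[1:]] = values
        elif key.startswith("#"):
            global_header[key[1:]] = values
        else:
            break  # first non-header line: the data records start here
    return global_header, data_header


example = [
    "#slow5_version\t1.0.0\n",
    "#num_read_groups\t3\n",
    "@flow_cell_id\tFAH00000\tFAH00001\tFAH00002\n",
    "@run_id\t855cdb\t855cd1\t855cdc\n",
]
global_header, data_header = parse_header(example)
print(data_header["run_id"][1])  # run_id assumed to belong to read group 1
```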

SLOW5 Data

The actual data follows the SLOW5 header. Each line contains the information for a single read and is referred to as a record.

Primary fields

These fields are mandatory and must be arranged in the order that they appear below:

| Col | Field name | Data type | Description | Example value |
|---|---|---|---|---|
| 1 | read_id | char* | A unique identifier for the read. | 00592138-f120-4ab5-9916-c5567adb8e29 |
| 2 | read_group | uint32_t | Read group identifier. | 0 |
| 3 | digitisation | double | Number of quantisation levels in the analog-to-digital converter (ADC). For example, if the ADC is 12 bit, digitisation is 4096 (2^12). | 8192 |
| 4 | offset | double | The ADC offset error. This value is added when converting the signal to picoamperes. | 10 |
| 5 | range | double | The full-scale measurement range in picoamperes. | 1441.389893 |
| 6 | sampling_rate | double | Sampling frequency of the ADC, i.e., the number of data points collected per second. | 4000 |
| 7 | len_raw_signal | uint64_t | The number of samples in the raw signal (length of the raw_signal vector below). | 59676 |
| 8 | raw_signal | int16_t* | The raw signal, i.e., the direct acquisition values from the ADC, separated by commas. | 1039,588,588,593,586… |
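A record line can be split into these primary fields with a few lines of Python. This is only a sketch: the function name `parse_record` is hypothetical, the example line uses made-up signal values, and any columns after the eighth are assumed to be auxiliary fields and are kept as unparsed strings.

```python
def parse_record(line):
    """Parse one tab-delimited SLOW5 ASCII record into a dict (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    record = {
        "read_id": fields[0],
        "read_group": int(fields[1]),
        "digitisation": float(fields[2]),
        "offset": float(fields[3]),
        "range": float(fields[4]),
        "sampling_rate": float(fields[5]),
        "len_raw_signal": int(fields[6]),
        # raw_signal is a comma-separated list of integer ADC values
        "raw_signal": [int(x) for x in fields[7].split(",")],
    }
    if len(record["raw_signal"]) != record["len_raw_signal"]:
        raise ValueError("len_raw_signal does not match the raw_signal length")
    record["aux"] = fields[8:]  # assumed auxiliary columns, left as strings
    return record


# Made-up record with three samples
rec = parse_record("read0\t0\t8192\t6\t1467.6\t4000\t3\t498,492,495\n")
print(rec["read_id"], rec["raw_signal"])
```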

Primary fields contain all the information required for a typical nanopore signal-level analysis. The raw signal can be converted to picoamperes using the following equation:

signal_in_pico_ampere = (raw_signal + offset) * range / digitisation
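The equation above translates directly into code. The sketch below applies it to every sample of a read. The function name `to_picoamperes` and the sample values are illustrative only, and `range_` carries a trailing underscore simply to avoid shadowing Python's built-in `range`.

```python
def to_picoamperes(raw_signal, digitisation, offset, range_):
    """Apply (raw_signal + offset) * range / digitisation to each sample."""
    return [(sample + offset) * range_ / digitisation for sample in raw_signal]


# Made-up samples with the header values from the single read group example
pa = to_picoamperes([498, 492, 495], digitisation=8192, offset=6, range_=1467.6)
print([round(x, 2) for x in pa])
```

With these values the first sample, 498, works out to (498 + 6) * 1467.6 / 8192, which is roughly 90.29 pA.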

Auxiliary fields

These fields are optional and not bound by any strict order. Following are some common auxiliary data fields in SLOW5 format:

| Field name | Data type | Description | Example value |
|---|---|---|---|
| channel_number | char* | The channel number. A flow cell has multiple channels allowing multiple DNA/RNA strands to be sequenced in parallel. For instance, a MinION flow cell has 512 channels and thus can sequence 512 strands in parallel. | 504 |
| median_before | double | The estimated median current level immediately preceding the read. In most cases this can be used as an estimate of the open-pore level. The open-pore state is when there is no strand inside the pore. | 238.78225708007812 |
| read_number | int32_t | A unique number within each channel counted upwards from zero. Note that not all reads generated are “strand” reads, but only strand reads are written to the final fast5 file, so some read numbers may be absent. | 17981 |
| start_mux | uint8_t | The MUX setting for the channel when the read began. Each channel contains one or more wells. For instance, a MinION flow cell has 4 wells per channel. The wells within a channel are connected to a multiplexer (MUX), a switch that controls which of the four wells in the channel is controlled and read out for sequencing. | 4 |
| start_time | uint64_t | The start time of the read. The unit for start_time is ‘number of signal samples’, so start_time has to be divided by the sampling rate (sampling_rate) to get the start time in seconds (i.e., the time since the run was started). | 335845487 |
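As a small worked example of the start_time conversion, the sketch below divides the example start_time by the example sampling_rate from the primary fields. Pairing these two example values is an assumption made for illustration; they need not come from the same read.

```python
def start_time_seconds(start_time, sampling_rate):
    """Convert start_time (in signal samples) to seconds since the run started."""
    return start_time / sampling_rate


seconds = start_time_seconds(335845487, 4000)
print(round(seconds, 2), "seconds, or about", round(seconds / 3600, 1), "hours")
```

With these numbers the read starts roughly 83961.37 seconds into the run, a little over 23 hours.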

Please cite the following in your publications when using the S/BLOW5 file format:

Gamaarachchi, H., Samarakoon, H., Jenner, S.P. et al. Fast nanopore sequencing data analysis with SLOW5. Nat Biotechnol 40, 1026-1029 (2022). https://doi.org/10.1038/s41587-021-01147-4

@article{gamaarachchi2022fast,
  title={Fast nanopore sequencing data analysis with SLOW5},
  author={Gamaarachchi, Hasindu and Samarakoon, Hiruna and Jenner, Sasha P and Ferguson, James M and Amos, Timothy G and Hammond, Jillian M and Saadat, Hassaan and Smith, Martin A and Parameswaran, Sri and Deveson, Ira W},
  journal={Nature biotechnology},
  volume={40},
  pages={1026--1029},
  year={2022},
  publisher={Nature Publishing Group}
}