Since November we have run 11 PromethION flow cells, generating >900 Gbp of data. It has been a journey of highs and lows, so we want to share what we have learned along the way.
Our first lesson: use those flow cells ASAP! Despite our lab’s earlier success with a three-month-old PromethION flow cell, three flow cells from our first shipment died after five weeks of storage in the fridge, and two had problems with leaking priming solution. After Oxford Nanopore sent a replacement box, we were back on track to meet our end-of-year sequencing goal.
The second lesson: flow cell QC can be a bit hit and miss. We found that estimates of active pore numbers varied by up to 3000 between repeated QC checks, even within 20 minutes of each other. For example, one flow cell had just over 5000 active pores on arrival, but when QC’d before sequencing had 7795 (YES!), and then following the first MUX scan had 5106 active pores (NO!) at the start of sequencing. Our final flow cell, which is currently running, had a count of 5613 on arrival, 4339 two weeks later, 5030 in the check an hour before the run, and 7513 active pores at the start of sequencing! We concluded that a best-of-three (or five) approach to QC is appropriate.
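One way to make the best-of-three idea concrete is to log every pore count from repeated QC checks and summarise them, rather than trusting a single reading. The sketch below is illustrative only: the function name `summarise_qc` is hypothetical, and taking the median as the summary is an assumption about what "best of three" should mean, not something the sequencing software does for you. The counts are the arrival, two-week and pre-run checks from the flow cell described above.

```python
from statistics import median


def summarise_qc(pore_counts):
    """Summarise repeated active-pore counts from flow cell QC checks.

    Hypothetical helper: requires at least three readings, in line with
    the best-of-three idea, and reports the median alongside the spread.
    """
    if len(pore_counts) < 3:
        raise ValueError("need at least three QC readings")
    return {
        "median": median(pore_counts),
        "min": min(pore_counts),
        "max": max(pore_counts),
        "spread": max(pore_counts) - min(pore_counts),
    }


# Arrival, two weeks later, and one hour before the run (from the text).
qc_readings = [5613, 4339, 5030]
summary = summarise_qc(qc_readings)
print(summary)
```

A large spread relative to the median is a hint to repeat the check before deciding whether a flow cell is worth loading.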
Our high-quality DNA was size-selected using the Nanopore Community SPRI protocol (adapted from the Schalamun and Schwessinger protocol), which noticeably improved the DNA profile by removing most fragments <1.5 kb. We produced 11 barcoded pools, containing three samples each, using the one-pot ligation and native barcoding protocol adapted for the SQK-LSK109 PromethION library preparation kit.
The number of active pores at the start of sequencing was a good indicator of how much data we would get. Our best flow cell, with 8370 active pores, generated a massive 144 Gbp of data!
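For a rough feel of what that relationship means per pore, the best flow cell's numbers can be turned into a yield-per-pore figure. This is a back-of-the-envelope sketch: the function names are hypothetical, and scaling yield linearly with starting pore count is a simplifying assumption rather than a fitted model. Because the rate comes from our best run, any projection made with it is optimistic.

```python
def yield_per_pore_mbp(total_yield_gbp, starting_pores):
    """Average yield per starting active pore, in Mbp."""
    return total_yield_gbp * 1000 / starting_pores


def project_yield_gbp(starting_pores, mbp_per_pore):
    """Naive linear projection of run yield (Gbp) from starting pores.

    Assumption: yield scales linearly with the starting pore count.
    """
    return starting_pores * mbp_per_pore / 1000


# Best flow cell from the text: 8370 active pores, 144 Gbp.
best_rate = yield_per_pore_mbp(144, 8370)

# Optimistic projection for a flow cell starting with 7513 active pores.
projection = project_yield_gbp(7513, best_rate)

print(f"{best_rate:.1f} Mbp per pore, projected {projection:.0f} Gbp")
```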
One software hiccup caused an in-progress sequencing run to stop when we started the QC for another flow cell. We are not sure what caused this, and it hasn’t happened since.
After basecalling using Albacore and demultiplexing using Porechop, we learned our third lesson: not all barcodes are created equal. While looking at our first eight flow cells, we found that two barcodes consistently performed poorly compared to the others in their pools. Barcode 3 and Barcode 7 accounted for only 3-5% and 13-19% of the data yield, respectively, on three different flow cells. We had attempted to produce equimolar pools, aiming for an ideal yield of 33% of the data for each barcoded sample. Consequently, we needed to do a few additional runs to catch up on some samples (to meet our aim of >20 Gbp per sample).
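The bookkeeping behind that lesson is simple enough to script: for each pool, compare each barcode's share of the yield with the equimolar expectation and work out how much more data a sample needs to reach the 20 Gbp target. The sketch below is a minimal illustration. The function name `barcode_report`, the "less than half the expected share" flagging rule, the barcode labels and the per-barcode yields are all hypothetical; the yields were invented to fall within the percentage ranges quoted above and are not our real numbers.

```python
def barcode_report(yields_gbp, target_gbp=20.0, flag_below=0.5):
    """Report each barcode's share of pool yield and its shortfall.

    yields_gbp: dict mapping barcode label -> yield in Gbp for one pool.
    A barcode is flagged when its share is below `flag_below` times the
    equimolar expectation (1 / number of barcodes in the pool).
    """
    total = sum(yields_gbp.values())
    if total <= 0:
        raise ValueError("pool has no yield")
    expected_share = 1 / len(yields_gbp)
    report = {}
    for barcode, gbp in yields_gbp.items():
        share = gbp / total
        report[barcode] = {
            "share": share,
            "underperforming": share < flag_below * expected_share,
            "shortfall_gbp": max(0.0, target_gbp - gbp),
        }
    return report


# Hypothetical three-sample pool (values invented for illustration).
example_pool = {"barcode03": 4.0, "barcode07": 16.0, "barcode11": 80.0}
pool_report = barcode_report(example_pool)

for barcode, stats in pool_report.items():
    print(barcode, f"{stats['share']:.0%}", stats["underperforming"],
          stats["shortfall_gbp"])
```

Summing the shortfalls across pools gives a quick estimate of how many catch-up runs are needed.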
Across all the runs so far, data yield has ranged from 57.55 Gbp to 144.54 Gbp per flow cell, with mean read lengths of 5145–6616 bp and median read lengths of 3527–5586 bp. We have one more pool currently sequencing (fingers crossed for 95 Gbp!), then the machine will have a well-deserved holiday.
Figures of run overviews were created using MinIONQC:
R Lanfear, M Schalamun, D Kainer, W Wang, B Schwessinger (2018). MinIONQC: fast and simple quality control for MinION sequencing data, Bioinformatics, bty654 https://doi.org/10.1093/bioinformatics/bty654
Latest posts by Caitlin Singleton
- All I want for Christmas is a terabase of Nanopore data - December 21, 2018