NCM17 – Day 2: Into the unknown

Day 2 of NCM17 started, once again, with Oxford Nanopore CEO Gordon Sanghera taking the stage and restating his dream that sequencing should be available to anybody, anytime, anywhere. There are currently around 12,500 mainframe DNA sequencers around the globe, a number he believes will be passed by ONT in the not too distant future. The availability of portable sequencers to the masses will change the way we sequence DNA (the sequencing singularity?), not to mention raise all the ethical questions involved in sequencing the genomes of everyone – a topic which was also covered in a panel discussion. It will undoubtedly improve our understanding of diversity on our planet.

Arwyn Edwards employs Gordon’s motto in his work as he, in his own words, sequences in strange places. One example is a field trip to an old coal mine in Wales, where risks were high due to methane gas. Although all equipment was plugged in above ground to avoid sparks in the explosive environment, he still had a battery-powered centrifuge blow up in a blue flame. Nevertheless, the team managed to get some reads out of the run, even though the input material was a mere 3×6 ng DNA! Other field trips have gone to Greenland to investigate microbial darkening and to Svalbard to monitor microbial populations in the melting Arctic ice. He has also been testing the new lyophilised reagents and found results consistent with other kits and Illumina sequencing. It is important to note, however, that all methods have biases, all databases are incomplete and all extraction methods have problems. We need to address these issues before we can go out and sequence metagenomes in the field.

Continuing this subject was Mads Albertsen, my PI, who had a clear message about what we need – more genomes! There are some problems in recovering genomes from metagenomes though: microdiversity in samples and separation of the genomes (binning). His approach is to use differential coverage, so binning can be performed by abundance. This is something that can, sometimes, be done with Illumina reads only, but hybrid assemblies with long Nanopore reads are making it simpler. Mads believes it won’t be long before we can beat evolution in speed and sequence everything! If you want to do metagenomes with differential coverage binning, check out our mmgenome R package.
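To make the idea of binning by abundance a bit more concrete, here is a minimal Python sketch. It is not how mmgenome works internally and not the method Mads presented; the contig names, coverage values, the `bin_by_coverage` helper and the tolerance are all made up for illustration. The only assumption is that contigs from the same genome show similar coverage in each of two samples, so they end up close together in log-coverage space.

```python
import math

# Hypothetical contigs: (name, coverage in sample A, coverage in sample B).
# All values are invented for illustration.
contigs = [
    ("c1", 100.0, 10.0),
    ("c2", 95.0, 11.0),
    ("c3", 9.0, 80.0),
    ("c4", 10.5, 85.0),
    ("c5", 105.0, 9.5),
]

def bin_by_coverage(contigs, tol=0.15):
    """Greedily group contigs whose log10 coverage profile lies within
    `tol` (Euclidean distance) of the first contig placed in a bin."""
    bins = []  # list of (seed_point, [contig names])
    for name, cov_a, cov_b in contigs:
        point = (math.log10(cov_a), math.log10(cov_b))
        for seed, members in bins:
            if math.dist(seed, point) <= tol:
                members.append(name)
                break
        else:
            # No existing bin is close enough, so this contig seeds a new one.
            bins.append((point, [name]))
    return [members for _, members in bins]

bins = bin_by_coverage(contigs)
print(bins)
```

With these toy numbers the contigs that are abundant in sample A and rare in sample B land in one bin, and the contigs with the opposite pattern land in another. Real data are much messier, which is why extra information such as long reads helps.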

It has been a great community meeting (my first!) with plenty of activity in the demonstration zone and flow cell loading workshop.

NCM17 – Day 1: New tech and exciting applications

All the gear

We started out with blazing punk rock (things are never too boring at Oxford Nanopore meetings) and plenty of promises from Oxford Nanopore CEO Gordon Sanghera. He gave us updates on various products, such as PromethION flow cells, which have been harder to produce than expected. However, we and others have now sequenced data on the PromethION! Admittedly not the 50 Gbp that Oxford Nanopore claims to be able to achieve in-house, but we are starting to be much more optimistic than we have ever been! The VolTRAX should be working now and a second version was announced! The second version will have a fluorescence detector built in for quantification, and it will be possible to load the prepped sample directly onto a flow cell. The second version will be released in the first half of 2018.

VolTRAX v. 2

The much-anticipated 128-channel Flongle will be available in the second half of 2018. We are looking forward to fast and cheap sequencing in the field and laboratory – although a MinION flowcell is relatively cheap (approx. 500 USD), it is still way too expensive for regular use in many applications where we do not need Gbps of data. Gordon Sanghera highlighted that they are currently also moving the Flongle into the diagnostic field. The sky seems to be the limit for small, cheap and generic diagnostic kits.

Oxford Nanopore has realised that data yield on flow cells depends largely on extraction and library preparation techniques and has started focusing more on this. Automation of everything from extraction to library prep would be ideal, but no updates were given on Zumbador other than the fact that a lot of work remains to be done. Along these lines, lyophilised reagents have been sent to a selected few for initial trials, and removing the cold chain will be a large step forward in enabling sequencing anywhere. Furthermore, the new mobile FPGA basecaller was shown, and it will be up to the community to name the unit in a competition – use the #NanoporeAnyone tag! The SmidgION and the mobile basecaller were displayed in the demonstration area and look very fancy.

The talks

Nick Loman was up first and gave us an outlook on what we will be able to achieve with sequencing in the future. Based on the sequencing of Ebola in West Africa, he showed a video of how the disease had spread and infected different regions. By using real-time genome sequencing we will be able to track diseases as they develop, and although outbreaks will happen, we should be able to avoid epidemics.

This sequencing singularity, as he termed it, is however dependent on sequencing genomes instead of genes, as is standard today. Long reads will make this possible as they can often span repeat regions in the genomes. The long reads will also mean that much less compute power is needed. An example was given with an E. coli sample, where an assembly was produced in miniasm in 1.5 seconds, producing a single contig from 8 reads (now that Nick and his team have lost the world record for the longest sequenced read, he still claims the world record for assembling E. coli with the fewest reads).

Sergey Koren continued on a related subject, talking about how we may finally be able to close the human genome. He showed very convincing results on integrating Hi-C with long-read Nanopore data; see their new tool SALSA and the related paper. Combining Nanopore and Illumina reads in hybrid assemblies is the preferred way to get the desired accuracy and read length, which was also the theme of Ryan Wick’s talk. He brought up some shortcomings of assemblers and stated that all genomes are somewhat metagenomes because of variation. What should assemblers do when encountering these differences? There is a need for more variation-aware assemblers and more graph-friendly tools.

The diversity of life on Earth has always been tricky to study. Steven Salzberg addressed this in his talk about closing large genomes. Even with long reads you need bioinformagic to assemble the wheat genome, which is 16 Gbp long and full of repetitive transposon regions. The fact that it has taken 8-9 months of sequencing, 1 trillion bases sequenced on Illumina and 545 billion bases sequenced on PacBio is a good indicator of how difficult it still is to close these genomes. And it is still not closed, but a lot closer than before!
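To put those numbers in perspective, the usual back-of-the-envelope measure is fold coverage: total bases sequenced divided by genome size. The sketch below only uses the figures quoted above and assumes the 16 Gbp figure is the genome size to divide by; the variable and function names are my own.

```python
# Figures quoted in the talk summary above.
GENOME_SIZE_BP = 16e9   # wheat genome, 16 Gbp
ILLUMINA_BASES = 1e12   # 1 trillion bases
PACBIO_BASES = 545e9    # 545 billion bases

def fold_coverage(total_bases, genome_size):
    """Average number of times each base in the genome is covered."""
    return total_bases / genome_size

illumina_cov = fold_coverage(ILLUMINA_BASES, GENOME_SIZE_BP)
pacbio_cov = fold_coverage(PACBIO_BASES, GENOME_SIZE_BP)
print(f"Illumina: ~{illumina_cov:.1f}x, PacBio: ~{pacbio_cov:.1f}x")
```

That works out to roughly 62x Illumina and 34x PacBio coverage, which shows that depth alone does not resolve a genome this repetitive.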

The evening speaker, Aaron Pomerantz, gave us a great presentation about his trip to the Ecuadorian rainforest with a mobile lab. The development of portable sequencing has the potential to give us bigger insights into the diversity in remote areas. Often samples have to be shipped halfway around the globe to be sequenced, adding further expenses and time to the workflow.

Nanopore Community Meeting 2017: Prologue

Oxford Nanopore is hosting their annual community meeting in New York City this week. I’ll be there, thanks to being in the top 5 of their “one more flowcell” competition – thanks to all who voted!

Although not as “big” as the London Calling conference they host, the line-up of speakers is impressive and promises talks on everything from megabase reads and gigabase yields to the newest assembly algorithms, field applications and maybe even some product updates? I’m excited! And in this blog post, I will give an overview of what I expect to hear at the meeting.

The first plenary speaker is Nick Loman, who, until recently, had the bragging rights for the longest read on the MinION. The record is now with Kinghorn Genomics, but it still falls short of a megabase.

Although megabase reads may not be practical or strictly necessary for most applications, the importance of read length is bound to be brought up by Loman, who will also speak about the future of sequencing – sequencing singularity, as he terms it. I am also looking forward to hearing Ryan Wick speak, as he will undoubtedly talk about his work with Unicycler, which we recently described as our preferred hybrid genome assembler. Wick will give us an overview of what to improve in order to obtain perfect assemblies (in general you should check out his awesome analyses and tools hosted on GitHub, e.g. his comparison of current basecalling algorithms for Oxford Nanopore data).

In general, the algorithm developers are very well represented, with both Sergey Koren, developer of Canu, who will give an update on assembling the human genome with ultra-long reads, and Steven Salzberg, who will present an update on their hybrid assembly method for large genomes (MaSuRCA), which is bound to be interesting (you should also check out his blog).

The current record holder in data yield, Baptiste Mayjonade, has a talk on day 1. It will be interesting to hear updates on yields from both researchers and Oxford Nanopore, as a lot of us are still struggling to reach 10 Gbp. Perhaps with the right methods we could all compete with Baptiste’s 15.7 Gbp on a MinION flowcell? We could also hope for some updates on PromethION flowcell yields, although these are still under heavy development (we have been running our own PromethION, and are now waiting for the next flowcell shipment – a blog post should be in the works...).

The last speaker of day 1 is Aaron Pomerantz, who has been sequencing barcoded DNA in a rainforest in Ecuador. There is also Arwyn Edwards on day 2, who has been sequencing in extreme environments. As I am working with on-site DNA sequencing myself, I am very interested in hearing more about these projects. The general trend lately seems to be increased portability, which will undoubtedly spark a wide array of new exciting projects and give insights into the ecology of the Earth.

Another session of interest to me is the “metagenomics, microbiomes and microbiology” session on day 2, where my PI, Mads Albertsen, is giving a talk on metagenomics in the long-read era. I am currently moving into the metagenomics field and hope to gain some insight from this session.

With some Oxford Nanopore talks throughout the two days, I hope to get some updates on the new tech released. I’m particularly interested in the “flongle” (the small and dirt-cheap flowcell), which I have heard might be ready soon. Recently, we have seen tweets about the first version of the all-in-one Zumbador, new lyophilised reagents being tested and the first version of chemistry for the VolTRAX sample prep device.

I am sure I will be wiser by the end of next week. I will be giving some daily digests here, so stay tuned. If you are going to NCM17 yourself, please come talk to me by poster 18 on Thursday 10.50-11.15 AM and 3.25-3.50 PM, and I can give you an overview of how we recently sequenced activated sludge samples on-site at a wastewater treatment plant (and on the way home).