Following our PromethION unboxing event we have finally managed to connect the machine to our University network and pass the configuration run. The configuration test demonstrates that the network can handle the data produced and transfer it fast enough to a network storage solution.
— Rasmus Kirkegaard (@kirk3gaard) January 26, 2017
After a successful configuration test (millions of fast5 files generated) we contacted Oxford Nanopore and they arranged to pick up the configuration unit.
— Rasmus Kirkegaard (@kirk3gaard) February 9, 2017
So now we just have to wait for the sequencing unit which they mentioned should be available around the end of February (Pretty soon!). Maybe the CTO is building it himself this very moment.
still building and shipping em – more to follow pic.twitter.com/F38RbGZyGp
— Clive G. Brown (@Clive_G_Brown) February 9, 2017
The hardware installation was super-easy, but the network configuration was a challenge. The PromethION settings are controlled through a web interface, which seems to run some configuration scripts whenever we change something and submit the change. However, it is not completely transparent what is going on, and handling the PromethION as a “black box” system with multiple network interfaces was not exactly straightforward. Furthermore, our University runs what they call an “enterprise configuration”. This essentially means that we are not allowed to play with settings, and in order to keep maintenance of the entire network feasible, we cannot make too many “quick and dirty” workarounds.
Hence, our local IT-experts have been essential in getting the PromethION up and running!
I would highly recommend that you get your hands on a “linux ninja” and a true network expert if you risk running into non-standard configurations (my personal guess is that it would probably have worked fairly easily if we had plugged it into a standard off-the-shelf router with the default configuration).
Our IT guys came up with the following wishes for improving the network setup:
“For faster understanding of the device, it would be nice with a network drawing, and maybe a few lines on how it is designed to work (for a Sysadmin or Network admin)
It should be possible to use static addresses for everything.
It would be really nice if you could tell the setup script a start or end IP. We had to reconfigure our network because setup insisted on using the top 2 addresses in the /24 network we had assigned to it (.253 and .254), which are used for HSRP in our network. We had to remove HSRP on our 10G network and enable DHCP on the local server network (Ethernet), which is not desirable.
Having a “management” interface on a DHCP-assigned address is a hassle. Allow static addresses, so it’s easy to use DNS names instead of guessing IPs.
Allow configuring static DNS.
The test device came with statically defined routes:
172.16.0.0
192.168.0.0 (Can’t remember subnet on this one)
Both are problematic, as they are WAY bigger than they need to be.
Especially 172.16.0.0 is problematic for our network, as our DNS servers are on 172.18.x.y
I did not find any good reason for 172.16 anywhere.
192.168 I am guessing is there for the Default setup. That should in my point of view be limited to 22.214.171.124 as this is the actual range used. Static routes can be needed, but should be kept as small as possible”
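To illustrate the point about oversized static routes, here is a minimal sketch using Python's standard `ipaddress` module. The netmask of the 172.16.0.0 route is not stated above, so the `/12` prefix (the full private block) is an assumption, and the DNS server address is a hypothetical example inside the 172.18.x.y range.

```python
import ipaddress

# ASSUMPTION: the prefix length of the 172.16.0.0 route is not given in the post;
# /12 (the whole private 172.16.0.0 block) is used purely for illustration.
broad_route = ipaddress.ip_network("172.16.0.0/12")

# A tighter route covering only 172.16.x.y, for comparison.
narrow_route = ipaddress.ip_network("172.16.0.0/16")

# ASSUMPTION: hypothetical DNS server somewhere in 172.18.x.y.
dns_server = ipaddress.ip_address("172.18.0.10")


def route_captures(route, address):
    """Return True if traffic to `address` would match the static `route`."""
    return address in route


if __name__ == "__main__":
    print("broad route captures DNS server: ", route_captures(broad_route, dns_server))
    print("narrow route captures DNS server:", route_captures(narrow_route, dns_server))
    print("addresses covered by broad route: ", broad_route.num_addresses)
    print("addresses covered by narrow route:", narrow_route.num_addresses)
```

Under these assumptions the broad route swallows the DNS server address while the narrow one does not, which is why keeping static routes as small as possible matters.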
As a student, you will have to present at some point during your education. Despite this, there is hardly any time allocated to learning the skills required to give a good presentation.
As part of your Master’s degree at Aalborg University you’ll have to participate in at least one status seminar presenting your thesis (20 minutes). Afterwards there is a 5 minute time slot for questions from the audience. The audience will be your fellow students, your supervisor(s) and other students or employees who may be interested in your project content.
My fellow (Albertsen Lab) master students and I spent approximately two weeks preparing for this. During this period, it became clear that the amount of guidance we got was pretty unusual. Hence, I thought I would share how we prepared and the difference it made, in general and specifically to our slides.
- 19th: Meeting, brainstorming about content of presentation
- 20th: Sending the first draft of the presentation and receiving feedback.
- 26th: Rehearsal of presentations. Each student within our group presented and we were constructively critiqued by others in the group regarding slide content and presentation skills.
- 27th: Improved slideshow was sent once again and feedback was given for the final time.
- 31st: Status seminar
Although it seems to be rather extensive, I feel all of our presentations benefited from the extra effort.
Example from Peter’s presentation
Before: Peter wanted to illustrate how he had optimized the method.
After: A line-up of conditions before and after optimization.
Example from Kasper’s presentation
Before: Kasper wanted to illustrate how your ordination plot can change depending on your choice of distance metric.
After: Kasper added a progress bar (with neutral colors), found an example to better illustrate his point, added the citation and underlined his point with a big red statement.
Example from my presentation (1)
Before: I wanted to show the current status of my network function.
After: I changed some visual properties in my tools for better visualization. I also changed the specific OTU names to example names, as my audience could not relate to the MiDAS database.
Example from my presentation (2)
Before: I wanted to make a quick introduction to correlation.
After: Removing text for simplification and adding citation.
What you cannot see from the examples is the improvement in the delivery of our presentations. As a student it can be nerve-racking to present science in front of an audience. If you haven’t had feedback, that is just one more thing to be nervous about. Getting feedback both on my slides and on my way of presenting them gave me the safety of proper preparation.
After this experience, I can’t help but feel thankful that learning to present is a high priority in our group. It is key to be able to communicate your message clearly, especially in a scientific community. It is not a part of our curriculum, and maybe it is too much to expect that students can learn to master this without any guidance.
Recently, I was one of 16 recipients of a 10 mill. DKK grant (1.3 mill. EUR) from the VILLUM foundation under their Young Investigator Program (YIP). The program is unique in Denmark and offers young scientists an opportunity to build a research group on their own terms. The foundation works on the premise of its founder, who famously said:
“One experiment is better than a thousand expert opinions”
Villum Kann Rasmussen
Hence, they simply support good experiments and trust that the researchers will come up with great solutions if the foundation interferes as little as possible. This means as little administration as possible and flexible funding if new opportunities arise during the project. While this sounds almost too good to be true, previous grantees have all said that it actually works this way!
So, how do I plan to spend 10 mill. DKK (1.3 mill. EUR)?
Microbial communities underpin all processes in the environment and have direct impact on human health. Despite their importance, only a tiny fraction of the millions of different microbes is known. This is mainly due to the immense difficulties of cultivating microbes from natural systems in the laboratory. This discrepancy is also known as the “microbial dark matter”.
For any microbe, the genome is the blueprint of its physiological properties. Having this in hand, it is possible to reconstruct its potential metabolism and establish hypotheses for evolution, function and ecology. Furthermore, it provides a foundation for further validating its function through a variety of in situ methods. However, genomes are extremely difficult to obtain from the microbial dark matter.
Currently, multiple metagenomes combined with bioinformatic approaches are used to retrieve individual genomes from complex samples (see e.g. our paper from 2013). This has led to numerous fundamental discoveries, including the discovery of bacteria capable of complete ammonia oxidation (Comammox, see here and here), which radically changed our view of the global nitrogen cycle and earned us the “Danish research result of the year, 2015”.
However, we are still far from realizing the full potential of metagenomics to retrieve genomes. This is mainly due to the complexity of nature, where multiple closely related strains co-exist, which renders the current approaches useless.
Using the VILLUM YIP grant we want to use cutting-edge DNA sequencing related techniques to enable access to all genomes despite strain-complexity, link genomes, plasmids and phages, and enable direct measurements of in situ bacterial activity. The ability to readily obtain activity measurements of any bacteria, in any microbial ecosystem, will radically change microbial ecology and environmental biotechnology.
Obtaining complete bacterial genomes
Retrieving individual bacterial genomes from complex microbial communities can be compared to mixing hundreds of puzzles with millions of pieces, all containing different shades of blue sky. However, one way to circumvent the problem with closely related strains is to use bigger pieces of DNA to assemble the genomes. The current standard approach is to use short-read sequencing (Illumina; approx. 2 x 250 bp). However, the rapid development within long-read DNA sequencing means that it is now possible to start experimenting and to envision how this problem is going to be solved.
The newest technology on the long-read market is Oxford Nanopore sequencing. It has successfully been used to generate complete genomes from pure cultures, and we have used it for metagenomics of simple enrichment reactors to obtain the first complete Comammox genome. We have been early access testers of the MinION and are currently involved in the developer program. The improvements in the technology during the first half of 2016 mean that its quality and throughput are now sufficient to attempt medium-complexity metagenomes. Furthermore, we are one of the early customers of the high-throughput version of the MinION, the PromethION, which, in theory, would allow us to tackle even complex metagenomes.
Furthermore, while long-read DNA sequencing might enable closed bacterial chromosomes, these are still not associated directly with e.g. plasmids and phages. However, in the last couple of years several new methods have appeared, e.g. Hi-C and 3C, that utilize physical cross-linking of the DNA inside cells to generate sequencing libraries from which proximity information can be retrieved. This information can then be used to infer which genetic elements were in close proximity, and thereby originated from the same bacterial cell. Until now, the methods have only been used in microbial communities of limited complexity, but there do not seem to be any theoretical limits that would hinder the use of the methods if complete genomes are available.
Measuring in situ replication rates using DNA sequencing
An exciting new possibility is that complete genomes enable measurements of bacterial replication rates directly from metagenomic data (see here and here). The method is very simple and based on the fact that the majority of bacteria start replication at a single origin and then proceed bi-directionally. Hence, in an actively dividing bacterial population there will be more DNA at the origin of replication than at the terminus. This can be directly measured using DNA sequencing, as the number of reads is proportional to the amount of DNA in the original sample. Hence, by comparing the number of reads (coverage) at the origin to that at the terminus, a measure of bacterial replication rate is obtained. This allows direct observations of individual bacterial responses to stimuli in the environment, even in very complex environments such as the human gut, and with sub-hourly resolution. This type of information has been the dream of microbial ecologists since the field emerged over 100 years ago and will allow for countless new experiments within microbial ecology. Recently, the method has even been demonstrated to work with high-quality metagenome bins (see here). It is going to be interesting to further explore the potential and limitations of the method using complete genomes at an unprecedented scale.
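The origin-versus-terminus comparison described above can be sketched in a few lines. This is a deliberately simplified illustration and not the published method: it assumes the genome is circular, that coverage has already been summarised into equally sized bins, that the bin containing the origin is known, and that the terminus sits directly opposite the origin. The function name and parameters are hypothetical.

```python
from statistics import median


def peak_to_trough_ratio(coverage, origin_bin, window=1):
    """Ratio of read coverage near the origin to coverage near the terminus.

    coverage   -- read depth per equally sized bin along a circular genome
    origin_bin -- index of the bin containing the origin of replication
    window     -- number of neighbouring bins on each side used for smoothing

    ASSUMPTION: the terminus lies half a genome away from the origin.
    """
    n = len(coverage)
    terminus_bin = (origin_bin + n // 2) % n

    def local_median(center):
        # Wrap around the ends because the chromosome is treated as circular.
        return median(coverage[(center + k) % n] for k in range(-window, window + 1))

    return local_median(origin_bin) / local_median(terminus_bin)


if __name__ == "__main__":
    # Toy data: coverage falls from the origin (bin 0) to the terminus (bin 4).
    growing = [200, 175, 150, 125, 100, 125, 150, 175]
    # Toy data: flat coverage, as expected for a population that is not replicating.
    resting = [100] * 8

    print("growing population:", peak_to_trough_ratio(growing, origin_bin=0))
    print("resting population:", peak_to_trough_ratio(resting, origin_bin=0))
```

A ratio close to 1 suggests little ongoing replication, while higher values indicate that a larger share of the population is actively copying its genome. Real data would need GC-bias correction, smoothing and origin detection, all of which are left out here.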
A few closing remarks
I am thrilled to have the next five years to explore how we can apply new DNA sequencing methods to understand the bacterial world and have the chance to build up a group of young scientists that share my excitement! If you think the project sounds great and either want to collaborate or work with us – then drop me an email!
Finally, I have to thank the people and mentors that made this possible. First of all my long-term mentor Per H. Nielsen; 6 years ago he introduced me to the world of microbial communities and throughout the years he has given me the freedom to pursue my own ideas – “freedom with responsibility” as we say in Danish. It is a leadership style that I very much try to adopt in my own newfound role as group leader. Secondly, my colleagues and friends Søren M. Karst and Rasmus H. Kirkegaard, whom I have persuaded to join me on further adventures down the rabbit hole! Furthermore, the long list of collaborators over the past years, where I have been fortunate to learn from some of the best scientists in the world (if you ask me). There are too many to mention, but special thanks go out to Holger Daims, Michael Wagner, Gene Tyson and Phil Hugenholtz.
As a newly started master student it can be a bit intimidating to attend a conference. Nevertheless, I said yes when I was offered to attend MEWE16 in Copenhagen. I had just ended a one-year maternity leave and was eager to start my thesis and talk to adult people.
I quickly realized that I was surrounded by experts. I recognized names from articles, got a bit starstruck and started thinking: How in the … am I supposed to learn anything from this? They are so much smarter than me.
However, as the days passed I realized there is a lot to take home from such an event – even for a small fry master student. So, I compiled a list:
- Watch and learn. If you can follow the science – great! If you can’t, look for other things. In what way did he/she present data? What worked? What did not? Did someone use other means to get their message through? One thing that I found could really keep my attention was humor. Not like comedy – a little goes a long way – but when my mind started to wander, humor redirected my attention back to the talk.
- Poster sessions. I was surprised that the poster session was so crowded and it quickly became awkward because I was unprepared. So, take some time to browse the posters and note if anyone is of particular interest. Should there be no one of interest, use it as practice in “talking science” to people you don’t know. A suggestion could be: 1) go to a poster 2) ask: “Can you tell me about your poster?” At least for me, it is very good practice.
- Networking can be many things. Don’t feel like you have to network with all the big guys, just because you’re in the same room. I am lucky enough to belong to a group that treasures social gatherings and I spent most of my time building my “local network”.
- Be nice to yourself. I think you should be fair to yourself and realize that very few talks are directed toward you. It is a crowd of experts and if you want to have a relevant debate on your topic, maybe you have to narrow your talk and that’s not always for small fry. This is an opportunity to get some fresh air, see the city or browse the posters as mentioned earlier.
Following our experiences with DNA sequencing using the MinION since 2014 as a part of the MinION (early) Access Programme (MAP) and their developers programme, we applied for a spot on the PromethION Early Access Programme (PEAP) back in May 2015. The MinION was the mindblowing DNA sequencer that allows you to do long-read (no fixed limit) DNA sequencing by plugging it into a laptop!!! It was an absolutely amazing piece of tech, but the initial throughput was not enough for our aim of retrieving the complete (and closed) genomes from all the abundant organisms in complex samples such as wastewater treatment systems. The PromethION promised a solution to this lack of throughput by having 48 times more flow-cells with 6 times more pores in each cell.
As we were waiting for the PromethION, we used the MinION frequently, and our first try at a metagenome sample was a simple two-species culture where we used the long reads to scaffold the Nitrospira genome and thus helped show that all the genes needed for complete nitrification were present in a single organism (Comammox). At the time, we could scaffold the Illumina-based assembly with some nanopore reads, but since then ONT has improved their technology tremendously and people have started to get data in the ~5 Gbp range from a single flowcell.
Hence, back-of-the-envelope calculations say that, without any further improvements, the PromethION would now be able to generate:
[5 Gbp per flowcell] * [6 x number of pores] * [48 flowcells] = 1440 Gbp (in just ~48 hrs)
In other words, this is equivalent to 288,000X coverage of a microbial genome of 5 Mbp (1,440,000 Mbp / 5 Mbp). If we want to retrieve genomes of organisms at 0.1% abundance, that would still amount to 288X coverage! While we expected improvements in throughput, we never foresaw that they would come this quickly, and then suddenly the day came when our PromethION configuration unit arrived. The unit was delivered by ONT in a small van and we had a nice little unboxing experience. The Nanopore hype has finally reached the entire department, which has started dreaming about applications for long-read sequencing.
— Rasmus Kirkegaard (@kirk3gaard) January 12, 2017
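For anyone who wants to redo the back-of-the-envelope throughput and coverage estimate above with their own numbers, here is a minimal sketch. The input values are the ones quoted in the post; the function and variable names are made up for illustration, and the calculation assumes that reads are shared between organisms strictly in proportion to their abundance.

```python
def promethion_yield_gbp(gbp_per_minion_flowcell, pore_factor, flowcells):
    """Scale a MinION flowcell yield up to a full PromethION run (in Gbp)."""
    return gbp_per_minion_flowcell * pore_factor * flowcells


def expected_coverage(total_gbp, genome_size_mbp, relative_abundance=1.0):
    """Expected coverage of one genome, given its share of the sequenced DNA.

    ASSUMPTION: reads are distributed proportionally to relative abundance.
    """
    total_mbp = total_gbp * 1000
    return total_mbp * relative_abundance / genome_size_mbp


if __name__ == "__main__":
    # Numbers from the post: ~5 Gbp per flowcell, 6x the pores, 48 flowcells.
    total = promethion_yield_gbp(5, 6, 48)
    print("total yield (Gbp):", total)

    # A 5 Mbp genome making up the whole sample, and one at 0.1 % abundance.
    print("coverage, pure culture:", round(expected_coverage(total, 5)))
    print("coverage at 0.1 % abundance:", round(expected_coverage(total, 5, 0.001)))
```

Changing the per-flowcell yield or the target abundance shows how sensitive the final coverage is to each assumption.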
As the PromethION is expected to produce massive amounts of data in very little time, fast data transport and storage is another challenge. Even storing data from a single MinION is causing trouble for people.
Oh whoops. 500 gig ssd filled in less than 12 hours by 'disappointing' @nanopore run.
— mattloose (@mattloose) January 19, 2017
ONT therefore ships a PromethION configuration unit to test whether the local infrastructure is ready before shipping the actual PromethION. The accompanying manual states that the maximum expected signal data output would be 80 GB/hr per flowcell. The spec sheet for a NAS server suggested by ONT to move the data away from the PromethION itself while the sequencing is running includes 2 fibre connections and 12 x 6 TB SSDs to support the internal buffer of 24 TB SSD storage on the PromethION. This amount of SSD storage at enterprise quality does not come cheap, and it only covers a machine for temporary storage, not the bioinformatic computations that follow. Compute costs should not be neglected when considering whether to buy a PromethION. As prices tend to drop fast for computer equipment, postponing any unnecessary upgrades could save you a lot of money or give you much more compute power for the same amount. We therefore planned to buy a “cheap” storage server (for now) with the specs below to hopefully meet the needs of the configuration unit and pass the test.
- 2 x Intel Xeon 2650v4 (12 cores each)
- 768 GB DDR4 RAM (2400 MHz)
- 2 x 400 GB SSD (for the OS)
- 16 x 8 TB NL-SAS (12 Gbps)
- 2 x 10 Gbit SFP+ fibre ports
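As a rough sanity check of these numbers, the sketch below estimates the worst-case data rate and how long the buffers would last. It takes the 80 GB/hr per flowcell, 48 flowcells, 24 TB internal buffer, 16 x 8 TB disks and 2 x 10 Gbit links from the text, and assumes (this is not from the manual) that all flowcells run at the maximum rate simultaneously, that units are decimal (1 TB = 1000 GB), and that raw disk capacity is fully usable with no RAID or filesystem overhead.

```python
def total_rate_gb_per_hour(gb_per_hour_per_flowcell, flowcells):
    """Worst-case signal data rate with every flowcell at its maximum output."""
    return gb_per_hour_per_flowcell * flowcells


def hours_until_full(capacity_tb, rate_gb_per_hour):
    """Hours before storage of the given size fills up if nothing is moved off."""
    return capacity_tb * 1000 / rate_gb_per_hour


def required_gbit_per_second(rate_gb_per_hour):
    """Sustained network throughput needed to keep up with the data rate."""
    return rate_gb_per_hour * 8 / 3600


if __name__ == "__main__":
    rate = total_rate_gb_per_hour(80, 48)
    print("worst-case rate (TB/hr):", rate / 1000)
    print("hours to fill 24 TB internal buffer:", round(hours_until_full(24, rate), 2))
    print("hours to fill 16 x 8 TB server (raw):", round(hours_until_full(16 * 8, rate), 1))
    print("needed network throughput (Gbit/s):", round(required_gbit_per_second(rate), 2))
    print("nominal link capacity (Gbit/s):", 2 * 10)
```

Under these assumptions the two 10 Gbit links have headroom on paper, but the storage server would still fill within a day or two of continuous worst-case output, which matches the plan of not keeping raw signal data in the long term.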
We plan to upgrade our entire compute facility when we get a better overview of the true needs for running the sequencing and bioinformatics. With PromethION-level output of signal data, we do not expect to be able to store the raw data files, or upload them to the read archives, in the long term. Instead, we hope to obtain fastq or fasta files as early as possible and discard the raw signals. Re-sequencing samples will probably end up being a lot cheaper than storing raw signal data.
Currently, we are working with our IT support department to get everything connected and hope to be able to share a “hello world” from the PromethION soon!
The purpose of this blog is to give a glimpse into the Albertsen Lab. The blog will be managed by all lab members, who will blog about what they are currently working on or what they find exciting. A few keywords on upcoming blog themes:
Oxford Nanopore PromethION unboxing; 3D printing DNA extraction devices; Full length SSU rRNA sequencing on the MinION; Life as a master student; Automated sample preparation using Oxford Nanopore VolTRAX; Hunting for novel diversity; Long-read metagenomics; Making bioinformatic analysis accessible using Shiny.
That's it for now! I look forward to an exciting 2017!