Analysing amplicon data, how easy does it get?

Ever done amplicon DNA sequencing of the 16S rRNA gene to identify microbes? If so, then you must know about the challenge of analysing such complex data easily.

My name is Kasper, and I am currently a master student here at the Albertsens lab. When I first learned how DNA is sequenced today, I was astonished by the rapid development DNA sequencing technologies have been experiencing during the last decade. A whole human genome can now be sequenced within a day for less than $1000! The applications of DNA sequencing are countless and there are countless questions yet to be answered. But the first and most important question is perhaps… how? What to do with millions of sequences of just A, C, T and G’s? Well, that question is the foundation of a huge field within biology: Bioinformatics! Which is something we try to expand here at Albertsen Lab.

During my master thesis I have been working with ampvis to take it a step or two further. Using R for bioinformatics takes a certain skill level and I’ve spent weeks of my project at learning R in depth to be able to write R functions for my thesis. During my project, I have specifically been applying various ordination methods (such as Principal Components Analysis, Correspondence Analysis, Redundancy Analysis and more) and other multivariate statistics to activated sludge samples to identify patterns between and within danish wastewater treatment plants – more on that will follow in subsequent blog posts. However, after I spent all that time learning R I thought:

Does everyone need to spend so much time at learning how to do complex bioinformatics, at least, just to do simple data analysis?

Interactive data analysis through Shiny

If you want to do reproducible research, then yes. There is no other way. But if you just want to do brief analysis of amplicon data using basic functions of ampvis, I have done all the work for you by making an interactive Shiny app of ampvis. A shiny app is basically designed from a collection of HTML widgets custom made to communicate with an active R session, so anything that can be done in R, can be done with a shiny app. This means that you can now use ampvis using only mouse clicks! No R experience required.

Amplicon Visualiser in action

All you have to do is upload your data and start clicking around! If the data is suited for ampvis it is also suited for the app (an OTU table + metadata if any, minimal examples can be downloaded from within the app). We recommend using UPARSE for OTU clustering into an otutable. You can then do basic filtering based on the metadata columns and also subset taxa if your are looking for specific microbes. With the app is some example data from the MiDAS database with samples taken from Danish wastewater treatment plants (WWTP’s) from 2014 and 2015. As a start you can simply click analysis, then render plot to get a simple overview of all the samples taken grouped by WWTP. No harm can be done, feel free to try out every clickable element!

The app is available here:

Of course, it is open source, so the sourcecode is available on my Github. If you encounter any issues please report them here or in the comments below. Any suggestions for improvements are of course welcome.
(spoiler: Among future plans of the app is more extensive ordination with various metrics, data transformation and ordination types)