Centre for Geobiology

High microbial diversity or simply incorrect interpretation of data – how can we know for certain?

DNA sequencing makes it possible to study the genetic code of all living organisms. Many researchers today take advantage of advances in molecular biology technology that have led to massively parallel pyrosequencing.

The pyrosequencing technique generates enormous amounts of sequence data in a relatively short time. These data can contain small errors, called sequence noise, that can be difficult to identify. Researchers from the Universities of Glasgow and Newcastle, together with PhD student Anders Lanzén, have developed an algorithm called "PyroNoise" that effectively reduces the noise in pyrosequencing data sets. The algorithm is presented in Nature Methods (9 Aug).

In last week's BIOInfo, Jarl Giske explains (rough translation): "Before the pyrosequencing technique was developed, molecular biologists studied the genetic sequence of one species at a time. It was not so many years ago that the first full genome for a species was mapped. Pyrosequencing makes it possible to study the genes from many species at the same time. It has thus opened up new possibilities for analyzing the biodiversity of a microbial community. The technique has revealed amazingly high levels of biodiversity in many different environments. However, there is an inherent challenge with the technique: the data often include errors that may lead to biodiversity estimates higher than the 'real' values. The PyroNoise algorithm involves a series of applications that overcome this data noise."
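
To illustrate why sequence noise can inflate diversity estimates, here is a minimal toy simulation. It is not the PyroNoise algorithm and does not reflect how that algorithm works. It assumes a hypothetical community of five "true" sequences and a simple random substitution error model, which is a simplification of real pyrosequencing noise. All names and parameter values are illustrative assumptions.

```python
import random

BASES = "ACGT"

def simulate_reads(true_sequences, reads_per_sequence, error_rate, rng):
    """Copy each true sequence many times, substituting bases at random.

    The substitution model is an assumption made for illustration only.
    """
    reads = []
    for seq in true_sequences:
        for _ in range(reads_per_sequence):
            read = []
            for base in seq:
                if rng.random() < error_rate:
                    # Replace the base with a different one to model an error.
                    read.append(rng.choice([b for b in BASES if b != base]))
                else:
                    read.append(base)
            reads.append("".join(read))
    return reads

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical community: five distinct sequences, 100 bases each.
true_sequences = [
    "".join(rng.choice(BASES) for _ in range(100)) for _ in range(5)
]

clean = simulate_reads(true_sequences, 200, 0.0, rng)   # no errors
noisy = simulate_reads(true_sequences, 200, 0.01, rng)  # 1% error per base

unique_clean = len(set(clean))
unique_noisy = len(set(noisy))

print("true sequences:          ", len(set(true_sequences)))
print("unique reads, no noise:  ", unique_clean)
print("unique reads, with noise:", unique_noisy)
```

Without noise, the number of unique reads equals the number of true sequences. With even a small per-base error rate, many reads differ from their source sequence, so a naive count of unique reads greatly overstates the diversity of the sample. Noise-removal methods aim to correct for this kind of inflation.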

Lanzén is a PhD student at the Centre for Geobiology and has worked at the Computational Biology Unit. His supervisor is Lise Øvreås.
