Help text for RNA secondary structure prediction with KNetFold

Background

KNetFold is software for predicting the consensus RNA secondary structure of a given alignment of nucleotide sequences. It uses an innovative classifier system (a hierarchical network of k-nearest neighbor classifiers) to compute a "base pair" or "no base pair" prediction for each pair of alignment positions.

We evaluated the accuracy of the KNetFold algorithm on a set of 49 RNA sequence alignments obtained from the RFAM database. In our 2006 publication (see Reference), we show that on this test set the method outperforms the programs PFOLD and RNAalifold, and that it is able to predict pseudoknots. More detailed information can be found in that publication.

Quick start

For a really quick start, simply use the form pre-filled with the example tRNA alignment and click "submit" after providing your email address. If you want to compute a prediction for your own sequences, paste a set of aligned RNA sequences (in FASTA format) into the sequence text area of the form below. Providing the email address is optional but recommended, because computing the results can take more than an hour (depending on the length of the submitted alignment). The use of the options is explained below.
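
Before submitting your own data, it can help to check locally that the sequences really are aligned, i.e. that every FASTA record has the same length once gaps are included. The following is a minimal sketch of such a check, not part of KNetFold; the function names read_fasta and is_aligned and the example sequences are made up for illustration, and the server may apply additional checks of its own.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def is_aligned(records):
    """True if there is at least one record and all sequences have equal length."""
    lengths = {len(seq) for _, seq in records}
    return len(records) > 0 and len(lengths) == 1


# Hypothetical two-sequence alignment; "-" marks a gap.
example = """>seq1
GCGGAUUUAGCUC-AGU
>seq2
GCCGAUAUAGCUCAAGU
"""

print(is_aligned(read_fasta(example)))
```

If the check fails, the sequences most likely still need to be aligned with an alignment program before they are pasted into the form.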

Retrieving results

There are 3 different ways to obtain the computed results. Please note: if you did not provide an e-mail address, you have to store the job ID or keep the page generated upon query submission open in your browser!

Options

Filter options

We offer two different schemes for mapping the matrix that represents a contact prediction into one unique secondary structure. The "winner takes all" filter was used for computing the results in our 2006 publication. It is fast and works well, but in some instances it leads to the prediction of implausible pseudoknots. For this reason we now also offer a type of distance geometry algorithm that filters out sterically impossible pseudoknots.
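
To illustrate the general idea of turning a contact matrix into a single structure, here is a minimal sketch of one plausible greedy "winner takes all" scheme. It is an assumption about how such a filter can work, not the KNetFold implementation: the function name, the score threshold of 0.5 and the example matrix are all hypothetical. Candidate pairs are visited from highest to lowest score, and a pair is accepted only if neither position is already paired. Nothing in this scheme forbids crossing pairs, which is why pseudoknots (plausible or not) can appear in the result.

```python
def winner_takes_all(scores, threshold=0.5):
    """Greedily pick base pairs from a square score matrix.

    scores[i][j] (with i < j) is the predicted score for pairing positions
    i and j. Returns a list where entry i is the partner of position i,
    or None if position i is unpaired.
    """
    n = len(scores)
    candidates = [
        (scores[i][j], i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if scores[i][j] >= threshold
    ]
    # Highest score first; ties are broken by position for reproducibility.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))

    partner = [None] * n
    for _score, i, j in candidates:
        if partner[i] is None and partner[j] is None:
            partner[i] = j
            partner[j] = i
    return partner


# Hypothetical 4-position matrix: (0,3) wins over (0,2) because it scores higher.
matrix = [[0.0] * 4 for _ in range(4)]
matrix[0][3] = 0.9
matrix[0][2] = 0.8
matrix[1][2] = 0.7

print(winner_takes_all(matrix))
```

The distance geometry filter is not sketched here, because the text above does not describe it in enough detail.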

Minimum stem length

This option requires all stems of a predicted secondary structure to have a certain minimum length. When evaluating the accuracy of KNetFold, we found that the prediction accuracy is slightly higher if the stem length is not restricted (minimum stem length of 1). However, the RFAM alignments used for the evaluation are generally of high quality. Alignments consisting of only a few sequences can lead to spurious predictions of single base pairs. Because of this, the default for this option is a minimum stem length of 2.
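
The effect of this option can be sketched as follows. This is an illustration under stated assumptions, not the server's code: a stem is taken to be a run of directly stacked pairs (i, j), (i+1, j-1), ..., positions are 0-based, and the function name filter_short_stems is hypothetical.

```python
def filter_short_stems(pairs, min_stem_length=2):
    """Drop base pairs that belong to stems shorter than min_stem_length.

    pairs is an iterable of (i, j) tuples with i < j. A stem is a maximal
    run of stacked pairs (i, j), (i + 1, j - 1), (i + 2, j - 2), ...
    """
    pair_set = set(pairs)
    kept = []
    for i, j in sorted(pair_set):
        # Start only at the outermost pair of a stem.
        if (i - 1, j + 1) in pair_set:
            continue
        stem = []
        a, b = i, j
        while (a, b) in pair_set:
            stem.append((a, b))
            a, b = a + 1, b - 1
        if len(stem) >= min_stem_length:
            kept.extend(stem)
    return sorted(kept)


# Hypothetical prediction: one stem of three pairs plus an isolated pair (4, 6).
predicted = [(0, 10), (1, 9), (2, 8), (4, 6)]

print(filter_short_stems(predicted, min_stem_length=2))
print(filter_short_stems(predicted, min_stem_length=1))
```

With a minimum stem length of 2, the isolated pair is removed; with a minimum stem length of 1, the prediction is returned unchanged.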

Reference

E. Bindewald, B.A. Shapiro:
RNA secondary structure prediction from sequence alignments using a network of k-nearest neighbor classifiers.
RNA. 12(3):342-352 (2006).

Acknowledgments

This server was developed in the research group of Dr. Bruce A. Shapiro. It is hosted by the Advanced Biomedical Computational Science (ABCS) of the National Cancer Institute (Frederick Campus).