Documentation

(See also the tutorial.)



The aim of PriFi is to suggest a few primer pairs based on a DNA sequence alignment, and to give an account of the quality of the suggested primers.


PriFi lets the user either load an existing alignment file (in the .aln format) or, if the alignment program Clustalw (by the European Bioinformatics Institute) is available, have PriFi perform the alignment from a multiple-sequence file (in the Fasta format). A user without access to Clustalw can obtain the alignment file from the web version of Clustalw: http://www.ebi.ac.uk/clustalw/.

PriFi runs in one of two overall modes: a general-purpose mode or a so-called intron mode. In intron mode, the program expects one or more of the sequences in the alignment to contain special intron symbols. Before uploading the sequences, the user must replace the introns of at least one of the input sequences with runs of X's according to this translation code:

XXX       intron, length <= 200 bp
XXXX      intron, length 201 - 500 bp
XXXXX     intron, length 501 - 1000 bp
XXXXXX    intron, length > 1000 bp
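
The translation code can be applied mechanically. The following is a minimal Python sketch, not part of PriFi itself; the function names intron_symbol and mask_introns, and the convention of giving introns as 0-based half-open (start, end) intervals, are assumptions made for illustration.

```python
def intron_symbol(length_bp):
    """Return the run of X's that stands in for an intron of the given length."""
    if length_bp <= 0:
        raise ValueError("intron length must be positive")
    if length_bp <= 200:
        return "XXX"
    if length_bp <= 500:
        return "XXXX"
    if length_bp <= 1000:
        return "XXXXX"
    return "XXXXXX"


def mask_introns(sequence, introns):
    """Replace each intron by its symbol.

    `introns` is a list of non-overlapping (start, end) intervals,
    0-based and half-open (an assumption of this sketch).
    """
    pieces = []
    pos = 0
    for start, end in sorted(introns):
        pieces.append(sequence[pos:start])          # exon part before the intron
        pieces.append(intron_symbol(end - start))   # intron replaced by X's
        pos = end
    pieces.append(sequence[pos:])                   # trailing exon part
    return "".join(pieces)
```

For example, a sequence with a single 250 bp intron between two exons would carry XXXX at the position of that intron.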

This mode is designed specifically to follow the methods described in [1]. To run PriFi in general mode, where sequences do not contain intron symbols, the user must click the Configure button and set the last parameter (Introns in sequences) to "no" before uploading any data file.

There are three levels of filtering in the primer pair design process. The first filter operates on the complete alignment by delimiting the regions within which primers must be found. The second filter operates on single primer candidates within these regions; the third on primer pairs. The way the filters work is explained below and is visualized in this figure, each filter represented by a yellow triangle (you might refer to the figure when reading the explanation):



First filter


Any primer is based on some subsection of the complete alignment. Certain alignment columns may be considered unfit to form the basis of a primer. Such columns are disregarded and instead delimit the primer regions within which primers are to be located.

The primer regions are those alignment subsections that have at least the minimum primer length (default 18) and which do not contain columns which
Less conserved regions contain many mismatch columns, and some of these columns are used as delimiters between conserved regions. All valid primers must have a minimum length (l) and at most a maximum number of mismatches (m). For each mismatch column, we check whether a window of length at least l can be placed around it such that the part of the alignment covered by the window has at most m mismatches. If no such window exists, the column can never be part of a valid primer, and it is masked out. After this masking procedure, the conserved regions are identified as those regions which have a length of at least l and contain no masked columns.
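
The masking procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not PriFi's actual code: the alignment is reduced to a list of booleans (True for a mismatch column), only mismatch columns are considered as possible delimiters, and the function names mask_columns and primer_regions are hypothetical. Because lengthening a window can never lower its mismatch count, the sketch only tests windows of exactly length l.

```python
from itertools import accumulate


def mask_columns(mismatch, l, m):
    """Mark mismatch columns that cannot be part of any valid primer.

    A column is masked if no window of length l containing it has
    at most m mismatches.
    """
    n = len(mismatch)
    # prefix[i] = number of mismatch columns among the first i columns
    prefix = [0] + list(accumulate(int(x) for x in mismatch))
    masked = [False] * n
    for c in range(n):
        if not mismatch[c]:
            continue
        first = max(0, c - l + 1)   # leftmost window start that still covers c
        last = min(c, n - l)        # rightmost window start that fits
        fits = any(prefix[s + l] - prefix[s] <= m
                   for s in range(first, last + 1))
        masked[c] = not fits
    return masked


def primer_regions(masked, l):
    """Return (start, end) half-open runs of unmasked columns of length >= l."""
    regions = []
    start = None
    # A sentinel masked column closes a run that reaches the alignment end.
    for i, is_masked in enumerate(list(masked) + [True]):
        if not is_masked and start is None:
            start = i
        elif is_masked and start is not None:
            if i - start >= l:
                regions.append((start, i))
            start = None
    return regions
```

With l = 4 and m = 1, an alignment of six conserved columns, three mismatch columns, and six conserved columns has only the middle mismatch column masked; the two flanking mismatch columns can each sit at the edge of a window with a single mismatch.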

Second filter


Within each primer region, all possible valid single primer candidates are identified and evaluated. To avoid keeping too many candidates for further consideration, the set is pruned while we still keep the best candidates.
A valid primer candidate is an alignment subsection which obeys the following rules:

Third filter


On level three, all possible pairs of primers are ranked. We obtain a primer from a candidate alignment subsection by using the consensus sequence for that subsection, inserting ambiguity codes in mismatch columns. All primer pairs which obey the following rules are kept and scored.
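
Deriving a primer from an alignment subsection can be illustrated with a short Python sketch. It assumes the standard IUPAC nucleotide ambiguity codes and an alignment given as a list of equal-length strings without gaps or intron symbols in the chosen columns; the function name consensus_primer is hypothetical and not taken from PriFi.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus_primer(rows, start, end):
    """Build a primer from alignment columns start..end-1 (half-open).

    Conserved columns yield their base; mismatch columns yield the
    ambiguity code covering all bases seen in that column.
    Raises KeyError on gaps or other non-ACGT symbols.
    """
    primer = []
    for col in range(start, end):
        bases = frozenset(row[col].upper() for row in rows)
        primer.append(IUPAC[bases])
    return "".join(primer)
```

For the three aligned sequences ACGTAC, ACGTGC and ATGTAC, the second column mixes C and T and the fifth mixes A and G, so the resulting primer is AYGTRC.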
Having discarded all invalid primers and primer pairs, the remaining pairs are scored and the four best pairs are reported. These are not simply the four highest-ranking pairs, since those would typically be practically the same primers, differing only by a few nucleotides at the ends. To ensure a certain diversity among the suggested primers, the overall best-scoring pair is reported first. Then the highest-scoring pair whose primers do not overlap by more than a certain number of nucleotides (default 10) with the already reported pair is reported, and so on. Moreover, if one of the primers in a pair overlaps with two individual primers already reported, the pair is disregarded.
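
The selection of diverse pairs can be read as a greedy procedure. The Python sketch below is one possible interpretation, not PriFi's implementation: each pair is a hypothetical (score, forward, reverse) tuple with primers given as half-open (start, end) alignment intervals, a candidate is rejected when either of its primers overlaps an already reported primer by more than max_overlap columns, and "overlaps with two individual primers" is taken to mean sharing at least one column with two distinct reported primers.

```python
def overlap(a, b):
    """Number of columns shared by two half-open intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def select_diverse(pairs, max_overlap=10, n_report=4):
    """Greedily pick up to n_report high-scoring, mutually distinct pairs."""
    reported = []
    seen = []  # distinct primers belonging to already reported pairs
    for pair in sorted(pairs, key=lambda p: p[0], reverse=True):
        primers = pair[1:]
        # Rule 1 (as interpreted here): too much overlap with a reported primer.
        too_similar = any(overlap(p, q) > max_overlap
                          for p in primers for q in seen)
        # Rule 2 (as interpreted here): one primer touches two reported primers.
        hits_two = any(sum(overlap(p, q) > 0 for q in seen) >= 2
                       for p in primers)
        if too_similar or hits_two:
            continue
        reported.append(pair)
        for p in primers:
            if p not in seen:
                seen.append(p)
        if len(reported) == n_report:
            break
    return reported
```

Under this reading, a pair that differs from the best pair only by shifting one primer two nucleotides is skipped, while a lower-scoring pair located elsewhere in the alignment is reported.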

Scoring criteria


Pairs are scored according to the following criteria:

(All parameters and criteria marked with * above become void when working in the general mode rather than in intron mode).


A note on self-complementarity (quoting the PriFi paper):

Evaluation of self-complementarity is currently not supported. [..] PriFi is first and foremost an attempt to capture the, to some extent, intuitive yet successful practice of our laboratory for primer design, and here, self-complementarity is not taken into account.

Using the Oligo Calculator by Qing Cao, Warren, and Buehler (http://www.basic.nwu.edu/biotools/oligocalc.html), we found that around 10% of the primers had significant regions of self-complementarity that might in theory result in self-priming during PCR. However all these primers have worked well in the laboratory.

Further, one of PriFi's users (Anne Chenuil from Centre d'Oceanologie de Marseille) has sent me this comment:

Dear Jakob

I just read the results of your paper and find out that you actually do not believe much in the self-complementarity criterion ....which I find interesting as it confirms my personal experience: I used to manually check thoroughly the primers and primer pairs for complementarity and this gave me lots of complications, though I observed that primers supposedly prone to these problems actually worked well (as well as primers with a 3' mismatch nucleotide C, by the way...)

If you have questions or comments, please contact:

Jakob Fredslund, assistant professor
BiRC - Bioinformatics Research Center
University of Aarhus
Høegh-Guldbergs Gade 10, Building 1090
DK-8000 Århus C
Denmark
(jakobf@birc.au.dk)




[1] Jakob Fredslund, Lene H. Madsen, Birgit K. Hougaard, Anna Marie Nielsen, David Bertioli, Niels Sandal, Jens Stougaard, Leif Schauser
A general pipeline for the development of anchor markers for comparative genomics in plants
BMC Genomics 2006, 7:207

