miRNA Precursor Candidates for Arabidopsis thaliana

What is this page?
Important Introductory Notes
Download Results and Programs
Search The Database
Bioinformatic Credits

What is this page?

    This page is a resource of miRNA and precursor candidates predicted for the Arabidopsis genome by the algorithm 'findMiRNA'. It also serves as the supplementary material for the paper titled 'Computational Prediction of miRNAs in Arabidopsis thaliana'. The page is provided to give the scientific community a clean interface to all of our raw computational results. The findMiRNA algorithm is sensitive enough to detect most previously reported miRNAs (see 'The miRNA Registry' at Rfam). Note that the algorithm also identifies other sequences with the potential to form hairpins, such as tRNAs, foldback elements and retrotransposons. The data are provided unfiltered, so users of this site must determine for themselves whether a given sequence represents a miRNA precursor.

Important Introductory Notes

Before using this database

   Before using this database to look for potential miRNA target sites in your gene of interest, it is advisable to already have some evidence suggesting that your gene is regulated by a miRNA. Such evidence might originate from genetic and/or bioinformatic sources. A genetic example would be a silent base substitution in the coding region (i.e. one that does not result in an amino acid substitution) that behaves as a dominant allele. Such a point mutation may lie in a miRNA target site and thus block negative regulation of the gene by a miRNA. A bioinformatic example would be the presence of a conserved ~20 nt sequence across a number of partially diverged genes, which may represent a conserved miRNA target site.
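   As a small illustration of the genetic example above, the following Python sketch checks whether a single-base substitution in a coding sequence is silent. It is not part of findMiRNA; the function name `is_silent_substitution` and its 0-based `position` argument are assumptions made for this example, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_silent_substitution(cds, position, new_base):
    """Return True if changing cds[position] to new_base alters the codon
    but not the encoded amino acid.

    cds is an in-frame coding sequence (DNA or RNA letters) and
    position is 0-based; both conventions are assumptions of this sketch.
    """
    cds = cds.upper().replace("U", "T")
    new_base = new_base.upper().replace("U", "T")
    if not 0 <= position < len(cds):
        raise ValueError("position is outside the coding sequence")
    start = position - position % 3          # first base of the affected codon
    codon = cds[start:start + 3]
    if len(codon) != 3:
        raise ValueError("position falls in an incomplete trailing codon")
    offset = position - start
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    return mutated != codon and CODON_TABLE[codon] == CODON_TABLE[mutated]
```

   For instance, in the toy sequence ATGGCTGAA (Met-Ala-Glu), changing the third base of GCT to C keeps alanine and is reported as silent, whereas changing its first base to C gives proline and is not.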

How to search the database

   To use this database, go to the search form and enter the transcript ID (as defined by TAIR) for your gene of interest. The default filter settings reduce the data by 96 percent and result in an average of 3 hits per transcript. The search form also allows the results to be limited to those precursors that have a presumptive homologue in rice as found by precExtract, an internally developed program that searches for miRNA homologues in other genomes. The sequence results can be shown either as RNA or as DNA, with RNA as the default option.

miRNA precursor families

   Many of the easily identifiable miRNA precursor families are likely to have already been discovered during the course of our work. However, other precursor families may still exist, and the input page has therefore been set up to allow users to limit the output to records for which the candidate miRNAs overlap single target sites on the transcript, a property of precursor families. These overlapping candidate precursor sequences can also be distinguished from the other data without selecting the filter, as the target site nucleotide range is shown in green text for these candidates. After the results are returned, individual candidate precursor sequences can be selected by checking the boxes on the left and aligned using the ClustalW alignment option provided at the base of the page (more computationally intensive alignments were performed for the publication using T_COFFEE). This enables the detection of the characteristic divergence pattern that is often observed for precursor families, where the miRNA and miRNA* sequences are more conserved than the intervening sequence of the hairpin.
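   One rough way to inspect the divergence pattern described above is to compute per-column conservation from an existing alignment and compare the miRNA or miRNA* region with the intervening loop. The Python sketch below assumes the alignment has already been produced (for example by ClustalW) and is supplied as equal-length gapped strings. The function names and the region coordinates are hypothetical and chosen only for illustration; this is not the analysis used in the publication.

```python
from collections import Counter


def column_conservation(aligned):
    """Return, for each alignment column, the fraction of sequences that
    share the most common non-gap residue (gaps count against the column)."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    profile = []
    for column in zip(*aligned):
        residues = [c for c in column if c != "-"]
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        profile.append(top / len(column))
    return profile


def mean_conservation(profile, start, end):
    """Mean conservation over the half-open column range [start, end)."""
    if not 0 <= start < end <= len(profile):
        raise ValueError("invalid column range")
    return sum(profile[start:end]) / (end - start)


# Toy alignment: columns 0-2 stand in for a conserved miRNA region,
# columns 3-4 for a more divergent intervening region.
toy_alignment = ["ACGUACGU", "ACGUUCGU", "ACGAGCGU"]
toy_profile = column_conservation(toy_alignment)
```

   Comparing `mean_conservation(toy_profile, 0, 3)` with `mean_conservation(toy_profile, 3, 5)` shows the conserved region scoring higher than the divergent one. With real precursor families the column ranges would come from the predicted miRNA and miRNA* positions.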

Target site conservation

    In addition to looking for the characteristic divergence pattern seen in precursor families, we highly recommend searching for conservation of target sites among related transcripts within Arabidopsis or between Arabidopsis and another species. Clear evidence of a conserved target site is the strongest indicator of a bona fide miRNA. Another strong indicator is the presence of more than one copy of the same target site within a single transcript, which is consistent with cooperative regulation of the transcript by the presumptive miRNA. We recommend using PATMATCH/WU-BLAST/BLASTN at TAIR to help identify potentially conserved target sites in plant transcripts, allowing a few mismatches to account for G-U pairing that may occur between the miRNA and its target. We also suggest aligning precursor sequences to plant genomic DNA using BLAST at NCBI, as this will often identify noise such as tRNAs and can be used to detect potential miRNA precursors in other plant genomes. Hairpin prediction for these potential precursor sequences can then be performed using either RNAfold or Mfold.
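    To make the role of G-U pairing concrete, the following Python sketch scores an ungapped miRNA/target duplex by counting G-U wobble pairs separately from true mismatches, and scans a transcript for windows below a chosen threshold. It is a simplified illustration only, not the scoring used by findMiRNA, PATMATCH or BLAST; the function names and the `max_non_wc` threshold are assumptions.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def _as_rna(seq):
    """Upper-case a sequence and convert DNA letters to RNA."""
    return seq.upper().replace("T", "U")


def score_site(mirna, site):
    """Count (wobbles, mismatches) for an ungapped antiparallel duplex.

    Both sequences are written 5'->3' and must have equal length, so
    mirna[i] is paired with site[-1 - i].
    """
    mirna, site = _as_rna(mirna), _as_rna(site)
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    wobbles = mismatches = 0
    for pair in zip(mirna, reversed(site)):
        if pair in WATSON_CRICK:
            continue
        if pair in WOBBLE:
            wobbles += 1
        else:
            mismatches += 1
    return wobbles, mismatches


def find_candidate_sites(mirna, transcript, max_non_wc=3):
    """Return (start, wobbles, mismatches) for every transcript window whose
    wobble plus mismatch count is at most max_non_wc (illustrative cutoff)."""
    transcript = _as_rna(transcript)
    width = len(mirna)
    hits = []
    for start in range(len(transcript) - width + 1):
        wobbles, mismatches = score_site(mirna, transcript[start:start + width])
        if wobbles + mismatches <= max_non_wc:
            hits.append((start, wobbles, mismatches))
    return hits
```

    Reporting wobbles and mismatches separately lets a user tolerate G-U pairs more readily than true mismatches when deciding which sites to follow up. Bulges and gaps are not modelled here.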

Bioinformatic Credits

   Alex Adai, Cameron Johnson, Varun Manocha

If you use our data and/or software in your research, please cite this web resource and our paper:
Computational Prediction of miRNAs in Arabidopsis thaliana.
Alex Adai, Cameron Johnson, Sizolwenkosi Mlotshwa, Sarah Archer-Evans, Varun Manocha, Vicki Vance, and Venkatesan Sundaresan. Genome Research 15:78-91, 2005
Addendum - Identification of miR395 targets by findMiRNA.


   We are extremely grateful to Edward Marcotte and the Marcotte Lab for previously hosting and supporting this site. We would also like to thank Andrej Sali of the University of California San Francisco for providing access to critical computational resources for the most recent runs of findMiRNA.