Monitoring in vivo folding of RNA across an entire genome

We developed a new approach to map the structure of RNA across the entire genome of any living organism. Our application of this method to the model plant Arabidopsis thaliana revealed novel insights into RNA splicing, alternative polyadenylation, translation, and RNA structure-function relationships.

HFSP Program Grant holders Philip Bevilacqua and Sarah Assmann and colleagues
authored on Mon, 28 April 2014

RNA is central to myriad biological phenomena, and its fold often dictates gene expression. Much is known about how RNA folds in vitro, but little is understood about how it folds in vivo. A few studies have provided insights into in vivo folding through chemical probing of RNA, but these were limited to a small number of highly abundant RNAs.

We developed a new approach that addresses these limitations by probing RNA structure in vivo across the genome. The system of interest (in our case, seedlings of A. thaliana) is treated with dimethyl sulfate (DMS), which modifies unpaired, accessible A and C residues on the Watson-Crick face. The chemical treatment is followed by reverse transcription, in which reverse transcriptase is unable to advance past the position of modification. Adaptors are added to the ends of the cDNA, allowing amplification followed by next-generation sequencing, which provides a read-out of the position of each DMS modification. This information is then used to constrain RNA structure prediction, infer RNA structure in vivo, and reveal transcriptome-wide ("meta") RNA structural properties.
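The read-out step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the study: the function names, the 0-based coordinate convention, the log-scaling, and the subtraction of an untreated control library are all assumptions made for the example. It assumes each mapped read start marks the last nucleotide copied before reverse transcriptase stopped, so the modified base lies one position 5' of it.

```python
import math


def stop_counts(read_starts, length):
    """Count reverse-transcription stops per nucleotide (0-based positions).

    Assumption: each read start is the transcript position of the last
    nucleotide copied into cDNA, so the DMS-modified base is at start - 1.
    """
    counts = [0] * length
    for start in read_starts:
        modified = start - 1
        if 0 <= modified < length:
            counts[modified] += 1
    return counts


def scale(counts):
    """Log-transform counts and divide by the mean log count (hypothetical scaling)."""
    logs = [math.log(c + 1) for c in counts]
    mean = sum(logs) / len(logs)
    return [x / mean if mean > 0 else 0.0 for x in logs]


def dms_reactivity(seq, treated_starts, control_starts):
    """Return a per-nucleotide reactivity list; None where the base is not A or C."""
    treated = scale(stop_counts(treated_starts, len(seq)))
    control = scale(stop_counts(control_starts, len(seq)))
    reactivity = []
    for base, t, c in zip(seq, treated, control):
        if base in "AC":
            # Subtract background stops seen without DMS; clip at zero.
            reactivity.append(max(0.0, t - c))
        else:
            # DMS reports on A and C only, so G and U carry no signal here.
            reactivity.append(None)
    return reactivity


if __name__ == "__main__":
    # Toy transcript with made-up read starts, for illustration only.
    print(dms_reactivity("GACUA", [2, 2, 2, 5], []))
```

High values in the returned list mark A and C residues that are likely unpaired and accessible; such values are the kind of per-nucleotide signal that can be passed to a structure prediction program as a constraint.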

Figure: Example of an RNA that folds very differently in silico and in vivo.  All data are from At1g55330, which encodes a putative arabinogalactan protein.  Left: Unconstrained in silico fold of the RNA.  Center: In vivo fold of the RNA obtained using DMS reactivity to constrain the fold.  Right: Comparison of in silico and in vivo folds.  Base pairs predicted uniquely in vivo are red; base pairs predicted uniquely in silico are black; base pairs common to both are green.  The right plot was prepared using the CircleCompare program in RNAstructure.
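The three-way classification of base pairs in the right panel can be illustrated with a short Python sketch. It does not reproduce the CircleCompare program; it assumes each fold is given in dot-bracket notation (an assumption of this example, not something stated above), and the function names and toy structures are hypothetical.

```python
def pairs_from_dot_bracket(structure):
    """Return the set of (i, j) base pairs (0-based) in a dot-bracket string."""
    stack = []
    pairs = set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def compare_folds(in_silico, in_vivo):
    """Split base pairs into those common to both folds and those unique to each."""
    a = pairs_from_dot_bracket(in_silico)
    b = pairs_from_dot_bracket(in_vivo)
    return {
        "common": a & b,
        "in_silico_only": a - b,
        "in_vivo_only": b - a,
    }


if __name__ == "__main__":
    # Toy structures of equal length, for illustration only.
    print(compare_folds("((...)).....", "(.....)(...)"))
```

In the figure's color scheme, the "common" set corresponds to the green base pairs, "in_vivo_only" to the red ones, and "in_silico_only" to the black ones.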

This approach has yielded nucleotide-specific information on RNA secondary structure for more than 10,000 transcripts and, with it, a number of new insights into RNA biology. We found a three-nucleotide (triplet) periodicity in DMS reactivity throughout the coding regions of mRNAs. In addition, we discovered secondary structural patterns that are involved in alternative polyadenylation and alternative splicing. Our method provides new avenues for probing RNA structure in vivo across entire genomes and for querying the effects of environmental conditions on RNA folding in any organism.
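One simple way to look for a triplet periodicity in a reactivity profile is to average the values by codon position and measure how uneven the three averages are. The Python sketch below is only an illustration of that idea, not the analysis performed in the study; the function names, the `cds_start` parameter, and the toy data are assumptions.

```python
import cmath


def mean_by_codon_position(reactivities, cds_start=0):
    """Average reactivity at codon positions 1, 2 and 3 of a coding region.

    Positions before cds_start and positions with no value (None) are skipped.
    """
    sums = [0.0, 0.0, 0.0]
    counts = [0, 0, 0]
    for i, r in enumerate(reactivities):
        if i < cds_start or r is None:
            continue
        frame = (i - cds_start) % 3
        sums[frame] += r
        counts[frame] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def period3_amplitude(means):
    """Magnitude of the period-3 Fourier component of the three position means.

    The result is zero when all three codon positions have the same mean.
    """
    omega = cmath.exp(-2j * cmath.pi / 3)
    return abs(means[0] + means[1] * omega + means[2] * omega ** 2) / 3


if __name__ == "__main__":
    # Toy profile in which only the first codon position is reactive.
    periodic = [1.0, 0.0, 0.0] * 10
    flat = [0.5] * 30
    print(period3_amplitude(mean_by_codon_position(periodic)))
    print(period3_amplitude(mean_by_codon_position(flat)))
```

A profile with a consistent difference between codon positions gives a clearly nonzero amplitude, while a flat profile gives an amplitude near zero.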


Ding, Y., Tang, Y., Kwok, C. K., Zhang, Y., Bevilacqua, P. C. & Assmann, S. M. (2014). In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features. Nature 505, 696-700.

Pubmed link

F1000 link

Nature link

Nature Methods link

Nature Chemical Biology link