Quant. Biol.    2017, Vol. 5 Issue (1) : 3-24
Comparative and integrative analysis of RNA structural profiling data: current practices and emerging questions
Krishna Choudhary, Fei Deng, Sharon Aviran
Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA 95616, USA
Background: Structure profiling experiments provide single-nucleotide information on RNA structure. Recent advances in chemistry, combined with the application of high-throughput sequencing, have enabled structure profiling at transcriptome scale and in living cells, creating unprecedented opportunities for RNA biology. These experimental advances have generated massive data sets of ever-increasing diversity and complexity, which pose new challenges for interpretation and analysis.

Results: We review current practices in the analysis of structure profiling data, with emphasis on comparative and integrative analysis, and highlight emerging questions. Comparative analysis has revealed structural patterns across transcriptomes and has become an integral component of recent profiling studies. Additionally, profiling data can be integrated into traditional structure prediction algorithms to improve prediction accuracy.

Conclusions: To keep pace with experimental developments, methods to facilitate, enhance and refine such analyses are needed. Parallel advances in analysis methodology will complement profiling technologies and help them reach their full potential.

Author Summary  RNAs are known to play essential roles in diverse cellular functions, extending well beyond the transfer of information from genes to proteins. RNA function is closely linked to its ability to fold into and convert between specific complex structures. Determining RNA structure has thus become a crucial step in understanding its function. Structure profiling experiments provide single-nucleotide information on RNA structure. Recent advances in chemistry, combined with the application of new high-throughput sequencing techniques, have enabled RNA structure profiling at transcriptome scale and in living cells, creating unprecedented opportunities for RNA biology. In this paper, we review current practices in the analysis of structure profiling data, with emphasis on comparative and integrative analysis, and highlight emerging questions.
Keywords: RNA structure profiling; high-throughput sequencing; RNA secondary structure prediction; chemical structure probing; SHAPE-Seq
Corresponding Author(s): Sharon Aviran   
Online First Date: 24 January 2017    Issue Date: 22 March 2017
Krishna Choudhary,Fei Deng,Sharon Aviran. Comparative and integrative analysis of RNA structural profiling data: current practices and emerging questions[J]. Quant. Biol., 2017, 5(1): 3-24.
Fig.1  Overview of structure-profiling experiments.

The RNA sample of interest (top) is probed with a structure-sensitive reagent, which introduces modifications (red pins) preferentially at unpaired nucleotides. The degree of modification is read out via reverse transcription and sequencing. Next, the reads are mapped to reference sequences, and normalized reactivities are calculated from summary counts of the mapped reads. Reactivity profiles of the probed RNAs are used in diverse downstream applications, some of which are listed.
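
The step from mapped-read counts to normalized reactivities can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular pipeline: it assumes reactivity is the modification rate in the treated sample minus the rate in an untreated control, floored at zero, and then rescales with a 2%/8% rule (drop the top 2% of values as outliers and divide by the mean of the next 8%). The function names and the input layout (per-nucleotide modification counts and coverage) are hypothetical.

```python
from statistics import mean


def raw_reactivities(treated_mods, treated_cov, control_mods, control_cov):
    """Treated-minus-control modification rate per nucleotide, floored at 0.

    Assumed scheme for illustration; positions with zero coverage get rate 0.
    """
    reactivities = []
    for t_mod, t_cov, c_mod, c_cov in zip(
        treated_mods, treated_cov, control_mods, control_cov
    ):
        treated_rate = t_mod / t_cov if t_cov else 0.0
        control_rate = c_mod / c_cov if c_cov else 0.0
        reactivities.append(max(treated_rate - control_rate, 0.0))
    return reactivities


def normalize_2_8(reactivities):
    """Rescale by the mean of the next 8% of values after the top 2%."""
    ranked = sorted(reactivities, reverse=True)
    n_outliers = int(round(0.02 * len(ranked)))
    n_average = max(1, int(round(0.08 * len(ranked))))
    reference = ranked[n_outliers:n_outliers + n_average]
    scale = mean(reference) if reference else 0.0
    if scale == 0:
        # Nothing to scale by; return the profile unchanged.
        return list(reactivities)
    return [r / scale for r in reactivities]
```

Other reactivity estimators and normalization schemes exist; the choice here is only meant to make the counts-to-profile step concrete.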

Fig.2  Quality screening with SEQualyzer.

Bars represent per-nucleotide SNR, and black lines represent the rolling mean of per-nucleotide SNR over windows of 20 nt. SEQualyzer estimates SNR via bootstrap as described by Choudhary et al. [59]. Examination of the quality profiles reveals that signal quality is good for the entire RNA except for a short region spanning nucleotides 35–53, where it is poor in all replicates. For illustration purposes, we used data for the P4–P6 domain of the Tetrahymena group I intron ribozyme from Loughrey et al. [33].
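
The rolling mean drawn over the per-nucleotide SNR bars can be sketched as below. This is not SEQualyzer's code: the caption does not say how windows are aligned, so the sketch assumes full windows indexed by their start position, and the `threshold` used to flag poor regions is a hypothetical parameter. The per-nucleotide SNR values are taken as given; the bootstrap estimate from [59] is not reproduced here.

```python
def rolling_mean(values, window=20):
    """Mean of each full run of `window` consecutive values.

    Returns one mean per window start, so the output has
    len(values) - window + 1 entries (empty if the input is too short).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(values) < window:
        return []
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide the window by one: add the new value, drop the oldest.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means


def low_quality_windows(snr, window=20, threshold=1.0):
    """Start indices (0-based) of windows whose mean SNR is below threshold.

    `threshold` is a hypothetical cutoff chosen by the user.
    """
    return [
        start
        for start, m in enumerate(rolling_mean(snr, window))
        if m < threshold
    ]
```

Applied to each replicate separately, a region flagged in every replicate would correspond to the kind of consistently poor stretch described in the caption.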

Fig.3  Comparison between MFE secondary structure and one of the suboptimal secondary structures for tRNA (asp), yeast.

(A) Reference (accepted) structure. (B) MFE structure. (C) Suboptimal structure. (D) Circular plot comparing the MFE structure in (B) to the reference structure in (A). (E) Circular plot comparing the suboptimal structure in (C) to the reference structure in (A). Structures are predicted using the Fold program in the RNAstructure package [125] with default parameters. Plots (A), (B) and (C) are prepared with VARNA [134]. Circular plots (D) and (E) are prepared with the CircleCompare program in RNAstructure. In (D) and (E), base pairs are indicated by lines. Pairs present in both the predicted and reference structures are in green; pairs present only in the predicted structure are in red; and pairs present only in the reference structure are in black.
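
The three-way classification of base pairs used in the circular plots can be sketched as a set comparison. This is an illustrative sketch, not the CircleCompare implementation: it assumes both structures are given as pseudoknot-free dot-bracket strings of equal length, and the function names are hypothetical.

```python
def pairs_from_dot_bracket(structure):
    """Return the set of 0-based (i, j) base pairs in a dot-bracket string."""
    stack = []
    pairs = set()
    for index, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(index)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % index)
            pairs.add((stack.pop(), index))
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def compare_structures(predicted, reference):
    """Split base pairs into shared, predicted-only and reference-only sets."""
    if len(predicted) != len(reference):
        raise ValueError("structures must have the same length")
    pred_pairs = pairs_from_dot_bracket(predicted)
    ref_pairs = pairs_from_dot_bracket(reference)
    return {
        "shared": pred_pairs & ref_pairs,          # green in the plots
        "predicted_only": pred_pairs - ref_pairs,  # red
        "reference_only": ref_pairs - pred_pairs,  # black
    }
```

For example, `compare_structures("((..))", "(....)")` reports the outer pair as shared and the inner pair as predicted-only.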

Fig.4  Information content of SHAPE data.

Two data-directed structure prediction methods, Deigan et al.’s approach [82] and RNAprob [135], are tested on a set of 23 RNAs, as used in [135]. For RNAprob, the variant with two structure contexts and empirical decoder is used. Bars represent SLW-average MCC values of quintiles with perfect information. Upper dashed lines represent the performance with the entire structure profile set to perfect information. Solid lines indicate the performance with the original structure profile data, and the bottom dashed line corresponds to the no-SHAPE control.
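
For readers unfamiliar with the accuracy measure, the Matthews correlation coefficient (MCC) of a predicted structure against a reference can be sketched as below. The sketch assumes the standard MCC definition applied to base pairs, with every unordered pair of positions counted as a possible pair; it does not reproduce the SLW averaging mentioned in the caption, and the function name and inputs are hypothetical.

```python
import math


def base_pair_mcc(predicted_pairs, reference_pairs, length):
    """MCC of predicted vs. reference base pairs for an RNA of `length` nt.

    Both pair arguments are sets of (i, j) tuples with i < j.
    All length * (length - 1) / 2 position pairs count as candidates
    (an assumption of this sketch).
    """
    tp = len(predicted_pairs & reference_pairs)
    fp = len(predicted_pairs - reference_pairs)
    fn = len(reference_pairs - predicted_pairs)
    tn = length * (length - 1) // 2 - tp - fp - fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Undefined when a row or column of the confusion table is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

A perfect prediction gives an MCC of 1, and each spurious or missed pair lowers the value.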
