PlotFold displays the optimal and suboptimal secondary structures for an RNA molecule predicted by MFold.
MFold uses the method of Zuker (see the MFold entry in this manual for more information) to determine optimal and suboptimal secondary structures for an RNA molecule. MFold writes an output file containing energy matrices that determine all optimal and suboptimal foldings for the RNA molecule. PlotFold reads this output file and displays representations of the optimal and suboptimal foldings.
PlotFold allows you to choose from among eight different representations of the optimal and suboptimal foldings determined by MFold. Each of these options allows you to specify an energy increment, which is the highest deviation in kcal/mole from the computed free energy minimum that a structure can have in order to be plotted. For example, if the predicted optimal folding has a computed free energy of -114.5 kcal/mole, and you select a 5.7 kcal/mole energy increment, then all plotted structures must have calculated free energies no greater than -108.8 kcal/mole (-114.5 + 5.7 = -108.8).
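The cutoff arithmetic in the example above can be checked with a few lines of Python. This is only an illustrative sketch; the function names `energy_cutoff` and `is_plotted` are hypothetical and are not part of PlotFold.

```python
def energy_cutoff(optimal_energy: float, increment: float) -> float:
    """Return the highest free energy (kcal/mole) a folding may have and still be plotted."""
    if increment < 0:
        raise ValueError("the energy increment must not be negative")
    return optimal_energy + increment


def is_plotted(energy: float, optimal_energy: float, increment: float) -> bool:
    """True if a folding's energy lies within the increment of the optimum."""
    # A small tolerance guards against floating-point rounding at the boundary.
    return energy <= energy_cutoff(optimal_energy, increment) + 1e-9


cutoff = energy_cutoff(-114.5, 5.7)
print(round(cutoff, 1))  # -108.8
```

A folding at -110.0 kcal/mole passes this test, while one at -108.0 kcal/mole does not.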
The first two folding representations, the energy dotplot and the p-num plot, plot base pairing information from all secondary structures that have free energies within the energy increment you specify. You can use these displays to determine which regions of the secondary structure prediction are well defined (see the example sessions below).
The energy dotplot indicates all of the base pairs involved in all optimal and suboptimal secondary structures within the energy increment you specify. The plot takes the form of a two-dimensional graph where both axes of the graph represent the same RNA sequence. Each point drawn in the graph indicates a base pair between the ribonucleotides whose positions in the sequence are the coordinates of that point on the graph.
The p-num plot graphs the amount of variability in pairing at each position in the RNA molecule. For each position of the sequence along the horizontal axis, the height of the plot indicates how many different pairing partners the program finds in all predicted foldings within the energy increment you specify.
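The relationship between the two survey plots can be sketched in Python. The example assumes that the foldings are available as an explicit list of base pairs, which is a simplification: PlotFold works from the MFold energy matrices, not from an enumerated list. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def dotplot_points(foldings, optimal_energy, increment):
    """Union of base pairs (i, j), i < j, over all foldings within the increment."""
    cutoff = optimal_energy + increment
    points = set()
    for energy, pairs in foldings:
        if energy <= cutoff:
            points.update((min(i, j), max(i, j)) for i, j in pairs)
    return points


def p_num(points, length):
    """Number of distinct pairing partners for each position 1..length."""
    partners = defaultdict(set)
    for i, j in points:
        partners[i].add(j)
        partners[j].add(i)
    return [len(partners[k]) for k in range(1, length + 1)]


# Hypothetical toy data: (free energy in kcal/mole, list of base pairs).
foldings = [
    (-10.0, [(1, 10), (2, 9), (3, 8)]),
    (-9.0, [(1, 10), (2, 8)]),
    (-4.0, [(1, 5)]),  # outside a 2.0 kcal/mole increment, so ignored
]
points = dotplot_points(foldings, optimal_energy=-10.0, increment=2.0)
print(sorted(points))            # [(1, 10), (2, 8), (2, 9), (3, 8)]
print(p_num(points, length=10))  # [1, 2, 1, 0, 0, 0, 0, 2, 1, 1]
```

In this toy case, position 1 has a single partner in every folding, so its pairing is well defined, while positions 2 and 8 each have two possible partners.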
The remaining six PlotFold options plot a sampling of specific secondary structures that have calculated free energies within the energy increment you specify. These options all display the same information, but in different forms. Rather than attempting to plot every secondary structure whose computed free energy falls within the energy increment, PlotFold selects representative foldings that are sufficiently different from each other and are still within the specified energy increment. You specify how different each folding must be from the others in response to the window size program prompt. To understand the concept of a window size, we must first define the distance between base pairs in different foldings. The distance between two base pairs, r(i)-r(j) and r(i')-r(j'), is the greater of |i - i'| and |j - j'|. For a window size of w, each listed folding must have at least w base pairs that are more than a distance of w from every base pair in any other listed folding.
Each of these six PlotFold options also lets you select the number of structures to plot. Since the number of structures that meet both the energy increment and window size criteria may be less than your selection, fewer secondary structures may actually be plotted.
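The distance and window size criteria can be written out as a short Python sketch. The distance function follows the definition given above. The greedy selection in the order supplied is an assumption made for illustration and is not necessarily how PlotFold chooses its representative foldings. All names are hypothetical.

```python
def pair_distance(p, q):
    """Distance between base pairs p = (i, j) and q = (i2, j2): max(|i - i2|, |j - j2|)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def differs_enough(candidate, other, window):
    """True if `candidate` has at least `window` base pairs that lie more than
    `window` away from every base pair in `other`."""
    far = sum(
        1
        for p in candidate
        if all(pair_distance(p, q) > window for q in other)
    )
    return far >= window


def select_foldings(foldings, window, list_size):
    """Keep foldings (assumed sorted from lowest to highest energy) that differ
    enough from every folding already kept, up to `list_size` foldings."""
    chosen = []
    for pairs in foldings:
        if len(chosen) == list_size:
            break
        if all(differs_enough(pairs, kept, window) for kept in chosen):
            chosen.append(pairs)
    return chosen


# Hypothetical toy foldings, each a list of base pairs.
a = [(1, 20), (2, 19), (3, 18)]
b = [(1, 20), (2, 19), (4, 17)]     # nearly identical to a
c = [(10, 40), (11, 39), (12, 38)]  # far from a

print(select_foldings([a, b, c], window=2, list_size=25) == [a, c])  # True
```

Here folding b is rejected because none of its base pairs lies more than 2 away from folding a, so only two structures are returned even though 25 were requested.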
The circles plot makes a circular Nussinov plot of an RNA secondary structure. The circular graph represents the sequence as a segment of a circle. You can set the radius and the angular width of one base so that plots of different secondary structures are strictly comparable. Arcs or chords connect paired bases; hairpin, bulge, interior, and multibranched loops are easily seen.
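The geometry of the circular layout can be sketched as follows. The default values are taken from the -ANGleperbase and -RADius parameters shown in the command-line summary. The starting angle and the counterclockwise direction are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def base_position(index, angle_per_base=1.2241, radius=45.0):
    """(x, y) of base `index` (1-based) on a circle centered at the origin."""
    # Assumption: base 1 sits at angle 0 and later bases advance counterclockwise.
    theta = math.radians((index - 1) * angle_per_base)
    return radius * math.cos(theta), radius * math.sin(theta)


def arc_span_degrees(n_bases, angle_per_base=1.2241):
    """Angular width of the circle segment occupied by the whole sequence."""
    return n_bases * angle_per_base


x, y = base_position(1)
print(x, y)  # 45.0 0.0
```

Because each base receives the same angular width, two plots made with the same radius and angle per base place any given base number at the same point, which is what makes the plots comparable.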
The domes plot represents a folded RNA sequence as a line with elliptical arcs connecting paired bases. This representation has the property that the length of the arc is proportional to the distance (along the primary structure) between the bases; all loops are easily seen.
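One way to picture a dome is as the upper half of an ellipse whose horizontal axis spans the two paired bases. The sketch below assumes that the horizontal axis is the major one and uses the default ratio of 0.8 given for -MINortomajor in the command-line summary. The function name and the sampling of the arc into straight segments are hypothetical.

```python
import math


def dome_arc(i, j, minor_to_major=0.8, steps=16):
    """Points on the upper half of an ellipse joining bases i and j on the x axis."""
    center = (i + j) / 2.0
    a = abs(j - i) / 2.0    # horizontal semi-axis: half the distance between the bases
    b = minor_to_major * a  # vertical semi-axis (assumed to be the minor one)
    return [
        (center + a * math.cos(math.pi * k / steps), b * math.sin(math.pi * k / steps))
        for k in range(steps + 1)
    ]


points = dome_arc(10, 30)
print(max(y for _, y in points))  # 8.0
```

With a fixed axis ratio, both semi-axes scale with the distance between the paired bases, so the arc length grows in proportion to that distance.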
The mountains plot makes a graph that looks like a mountain range. Horizontal striations upon a particular peak are bonds between paired bases, and vertical links between the horizontal striations represent stems.
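A common way to build a mountain-style profile is to let the height at each position equal the number of base pairs that enclose it. The sketch below uses that convention as an assumption; it is not necessarily the exact rendering used by the Mountains display, and the function name is hypothetical.

```python
def mountain_heights(pairs, length):
    """Height at each position 1..length: the number of base pairs enclosing it."""
    delta = [0] * (length + 2)
    for i, j in pairs:
        lo, hi = min(i, j), max(i, j)
        delta[lo] += 1      # step up at the first base of a pair
        delta[hi + 1] -= 1  # step down just after the second base of the pair
    heights = []
    level = 0
    for k in range(1, length + 1):
        level += delta[k]
        heights.append(level)
    return heights


print(mountain_heights([(1, 10), (2, 9), (3, 8)], length=10))
# [1, 2, 3, 3, 3, 3, 3, 3, 2, 1]
```

A stem of stacked base pairs appears as a slope on each side of a peak, and the flat top of the peak corresponds to the loop that the stem closes.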
The squiggles plot is a representation similar to what you might draw by hand; that is, bonds formed between bases are drawn as chords. Bases are shown participating in stems, as well as in hairpin, bulge, interior, and multibranched loops.
The text output representation of the RNA secondary structure is similar to the squiggles plot, but you don't need a graphics device to see it. The output is written into a text file that you can view with any text editor. If you exclude a region of the RNA molecule from folding in MFold with either the -CLOSedexcise or -OPENexcise parameter, you can view the predicted secondary structures only as text output; do not use any of the graphic plotting options of PlotFold to display the results.
The Connect file is a base-by-base text output file of optimal and suboptimal RNA secondary structures. This file can be used as input to several programs that produce graphical representations of RNA secondary structures. In the Wisconsin Package(TM), the Squiggles, Circles, Domes, and Mountains programs can read the Connect file output of MFold and display only the optimal structure. Other publicly available RNA secondary structure display programs may be able to display all optimal and suboptimal secondary structures listed in the Connect file.
The examples below demonstrate three different PlotFold options for the secondary structure representations.
The first example is a session using PlotFold to display an energy dotplot of the secondary structures determined for an Alu consensus sequence in the example session with MFold. The plot indicates all base pairs in all foldings whose predicted free energies are within the increment you specify of the predicted optimal folding energy.
% plotfold
PLOTFOLD with what saved energy matrix file ? alucons.mfold
Maximum size of interior loop = 30
Maximum lopsidedness of an interior loop = 30
Do you want to display:
SURVEY OF OPTIMAL AND SUBOPTIMAL FOLDINGS
A) sub-optimal energy plot
B) p-num plot
SAMPLING OF OPTIMAL AND SUBOPTIMAL FOLDINGS
C) circles
D) domes
E) mountains
F) squiggles
G) text output
H) connect file output
Please choose one (* A *):
Energy of optimal structure = -114.5
Plot base pairs at what energy increment (* 5.7 *) ?
How many color levels in the energy plot (* 1 *) ?
The minimum density for a one-page plot is
331.8 bases/100 platen units on each axis.
What point density would you like (* 331.82 *) ?
PLOTFOLD will take 1 pages. Would you like to:
P)lot the points
D)ifferent density
Q)uit
Please select one (* P *):
When your LaserWriter attached to tty07 is ready, press <Return>.
P)lot the points
D)ifferent density
Q)uit
Please select one (* Q *):
%
If you are reading the Program Manual, the plot from this session is shown in the figure below.
Points drawn in the upper triangular plot represent predicted base pairs in the RNA molecule. For example, a point drawn at position 267 on the vertical axis and position 200 on the horizontal axis indicates a base pair between ribonucleotides 267 and 200 in the sequence.
The plot displays all base pairs in all optimal and suboptimal foldings within the energy increment you specify. On a color plot, you can display base pairs from foldings at different levels of suboptimality with different colors. Black is reserved for base pairs involved in optimal foldings, and other colors are used for base pairs in structures at different levels of suboptimality. The color legend indicates the maximum free energy (in kcal/mole) for each color.
You can use the energy dotplot to determine which regions of the secondary structure prediction are well defined. Well-defined regions are those that have the least variability in all predicted secondary structures. For example, if you draw a vertical line from position 70 on the horizontal axis, and draw a horizontal line from position 70 on the vertical axis, the lines cross only one point on the graph. (If you are reading the Program Manual, this is shown in the figure below.) This means that when the ribonucleotide at position 70 is paired, it has the same pairing partner (at position 94) in all predicted optimal and suboptimal foldings within 5.7 kcal/mole of the computed optimal folding.
The second example is a session using PlotFold to display a p-num plot of the secondary structures determined for an Alu consensus sequence in the example session with MFold. This plot shows the amount of variability in pairing at each position in the sequence in all predicted foldings whose energies are within the increment you specify of the optimal folding energy.
% plotfold -INfile=alucons.mfold -MENu=b -Default
%
If you are reading the Program Manual, the plot from this session is shown in the figure below.
For each position of the sequence along the horizontal axis, the height of the plot indicates how many different pairing partners are found in all predicted optimal and suboptimal foldings within the energy increment you specify. You can use this information to help determine which regions of the secondary structure prediction are well defined. For example, the ribonucleotide at position 117 forms base pairs with 19 different bases in all computed secondary structures within 5.7 kcal/mole of the optimal folding. This great variability indicates that the pairing at this position may not be reliably determined. On the other hand, the ribonucleotide at position 36 forms no base pairs in any secondary structure within 5.7 kcal/mole of the computed optimal folding. This nucleotide may be reliably determined to be single-stranded in the "correct" folding.
The last example is a session using PlotFold to display mountains plots of optimal and suboptimal foldings determined for an Alu consensus sequence in the example session with MFold. The program plots representative secondary structures that satisfy the energy increment and window size criteria you specify.
% plotfold -INfile=alucons.mfold -MENu=e -MONitor -Default
Structures plotted: 3
%
If you are reading the Program Manual, the plot from this session is shown in the figure below.
Note that, although we requested 25 structure plots by default, the program plots only the three different structures that satisfy both the energy increment and window size criteria.
PlotFold accepts the energy matrix output file from MFold as input. This file cannot be read by eye. The energy matrix output files created by MFold in Version 7 of the Wisconsin Package cannot be read as input by PlotFold in later versions of the Wisconsin Package. To read Version 7 MFold output files, use the program OldPlotFold.
MFold predicts optimal and suboptimal secondary structures for an RNA molecule using the most recent energy minimization method of Zuker. PlotFold displays the optimal and suboptimal secondary structures for an RNA molecule predicted by MFold. FoldRNA predicts a single optimal secondary structure for an RNA molecule by the older method of Zuker. Circles, Domes, Mountains, Squiggles, and DotPlot all make graphic secondary structure representations with the .connect output file from FoldRNA and PlotFold.
The RNA secondary structure prediction algorithm and the folding energies used by MFold are more refined than the algorithm and energies used by FoldRNA. You cannot use the MFold energy files (see the LOCAL DATA FILES topic, below) with FoldRNA.
StemLoop finds all possible stems (inverted repeats) above some minimum quality that you can set, but StemLoop cannot recognize a structure with gaps (bulge loops or uneven bifurcation loops). The stems can be plotted with DotPlot.
If you exclude a region of the RNA molecule from folding in MFold with either the -CLOSedexcise or -OPENexcise parameter, you can display the predicted secondary structures using only the text output option (menu option G); do not use any of the graphic plotting options of PlotFold to display the results.
If you determined an energy matrix of RNA secondary structures in MFold using either the -FORCe or -CLOSedexcise command-line options, the energy of the optimal structure displayed in the PlotFold program prompt will be incorrect. Since the default energy increment is determined as a fraction of the energy of the optimal structure, the default energy increment will also be incorrect. To determine the correct optimal structure energy, first select one of the PlotFold options that plot a sampling of specific secondary structures (options C - H); next choose an energy increment that is equal to the reported energy of the optimal structure; and finally choose to plot only a single structure. When this single structure is plotted, its energy is reported correctly. You can then, in a subsequent run of PlotFold, use this knowledge of the correct optimal structure energy to specify an appropriate energy increment instead of accepting the incorrect default.
In Dr. Zuker's original program, you can select any base pair in the energy dotplot and the program then computes the folding of lowest free energy that includes that base pair. The GCG version of MFold currently does not include this feature.
The Wisconsin Package must be configured for graphics before you run any program with graphics output! If the % setplot command is available in your installation, this is the easiest way to establish your graphics configuration, but you can also use commands like % postscript that correspond to the graphics languages the Wisconsin Package supports. See Chapter 5, Using Graphics in the User's Guide for more information about configuring your process for graphics.
If you need to stop this program, use <Ctrl>C to reset your terminal and session as gracefully as possible. Searches and comparisons write out the results from the part of the search that is complete when you use <Ctrl>C. The graphics device should stop plotting the current page and start plotting the next page. If the current page is the last page, plotters should put the pen away and graphic terminals should return to interactive mode.
All parameters for this program may be put on the command line. Use the parameter -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.
Minimal Syntax: % plotfold [-INfile=]alucons.mfold -Default
Prompted Parameters:
-MENu=A energy dotplot
B p-num plot
C circles plot
D domes plot
E mountains plot
F squiggles plot
G text output
H connect file output
Energy Dotplot (A)
Prompted Parameters:
-INCrement=5.7 energy increment at which to plot base pairs
-LEVels=1 color levels of suboptimality
-DENsity=331.82 number of bases per 100 platen units
Optional Parameters:
-NOCAPtion suppress the caption
-NOLABels suppress all labels except for ticks
-TICKNUMbering=bc where to place tick numbering (only with -NOLABels)
a=bottom b=right c=top d=left
-TICKAXes connect ticks with a solid axis
-POIntcolor=1 set color for the points
-SYMbol=0 set symbol to be plotted (points by default)
-SYMBOLHeight=0.18 set height of centered symbols in platen units
-DOTSonly suppress connecting adjacent points with a line
-NOAXis suppress drawing an axis of symmetry
P-Num Plot (B)
Prompted Parameters:
-INCrement=5.7 energy increment at which to plot base pairs
-DENsity=252.2 number of bases per 100 platen units
Circles Plot (C)
Prompted Parameters:
-INCrement=5.7 energy increment to plot secondary structures
-LIStsize=25 maximum number of structures to display
-WINdow=5 minimum "distance" between any plotted foldings
-ANGleperbase=1.2241 degrees of arc given to each base
-RADius=45.0 radius of circle
Optional Parameters:
-SHOwseq show the sequence in the plot
-NUMbering[=10] display sequence numbers every 10th base
-NOTICks suppress the ticks and their numbers
-CHOrds connect paired bases with chords instead of arcs
Domes Plot (D)
Prompted Parameters:
-INCrement=5.7 energy increment at which to plot secondary structures
-LIStsize=25 maximum number of structures to display
-WINdow=5 minimum "distance" between any plotted foldings
Optional Parameters:
-SHOwseq show the sequence in the plot
-NUMbering[=10] display sequence numbers every 10th base
-NOTICks suppress the ticks and their numbers
-DENsity=207.14 sets the number of bases per 100 platen units
-MINortomajor=0.8 ratio between the axes of the ellipse
-RECtangles plot rectangles instead of ellipses
-PEAks plot diamond peaks instead of ellipses
Mountains Plot (E)
Prompted Parameters:
-INCrement=5.7 energy increment at which to plot secondary structures
-DENsity=331.82 number of bases per 100 platen units
-LIStsize=25 maximum number of structures to display
-WINdow=5 minimum "distance" between any plotted foldings
Optional Parameters:
-SHOwseq show the sequence in the plot
-NUMbering[=10] display sequence numbers every 10th base
-NOTICks suppress the ticks and their numbers
-STEMdepth=45 number of stems on the Y axis of each page
Squiggles Plot (F)
Prompted Parameters:
-INCrement=5.7 energy increment at which to plot secondary structures
-LIStsize=25 maximum number of structures to display
-WINdow=5 minimum "distance" between any plotted foldings
Optional Parameters:
-SHOwseq show the sequence in the plot
-SHOwseq[=32,45] specify a range of the sequence to be shown
-SEQHeight=0.9 height for sequence display and numbering
-NUMbering[=10] display sequence numbers every 10th base
-PIVot=i,j,theta pivot the substructure beginning at i and ending
at j theta degrees
Text Output (G)
Prompted Parameters:
-INCrement=5.7 energy increment at which to plot secondary structures
-LIStsize=25 maximum number of structures to display
-WINdow=5 minimum "distance" between any plotted foldings
Optional Parameters:
-LINesize=80 sets the number of characters per line
Connect File Output (H)
Prompted Parameters:
-INCrement=5.7 energy increment at which to save secondary structures
-LIStsize=25 maximum number of structures to save
-WINdow=5 minimum "distance" between any saved foldings
Optional Parameters: None
All GCG graphics programs accept these and other switches. See the Using
Graphics chapter of the User's Guide for descriptions.
-FIGure[=FileName] stores plot in a file for later input to FIGURE
-FONT=3 draws all text on the plot using font 3
-COLor=1 draws entire plot with pen in stall 1
-SCAle=1.2 enlarges the plot by 20 percent (zoom in)
-XPAN=10.0 moves plot to the right 10 platen units (pan right)
-YPAN=10.0 moves plot up 10 platen units (pan up)
-PORtrait rotates plot 90 degrees
PlotFold is an adaptation of part of the mfold package by Zuker and Jaeger (see the MFold entry in the Program Manual) that incorporates GCG routines to display representations of RNA secondary structures.
We thank Dr. Zuker, not only for making his work available to GCG, but also for helping us incorporate his work into the Wisconsin Package(TM).
GCG is allowed to distribute a GCG-compatible implementation of MFold under a license agreement with the National Research Council of Canada, Institute for Biological Sciences, Ottawa, Canada, K1A 0R6 (613)-993-4830. The copyright to MFold, however, belongs to the Government of Canada. If you use MFold for published research, cite Dr. Zuker's Science paper (see the MFold entry in the Program Manual for the appropriate reference). Any communication of the MFold program must be approved by the National Research Council of Canada.
LOCAL DATA FILES: None.
The parameters listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.
Energy Dotplot (A)
-NOCAPtion
suppresses the blue divider box and the text to its left.
-NOLABels
suppresses all of the labels except for the tick labels. Ticks are labeled with numbers on the right and top sides, unless you specify different sides to be numbered. See -TICKNUMbering below. Note that -FASt suppresses all text.
-TICKNUMbering=bc
With -NOLABels, you can choose which axes should have their ticks numbered. The letter codes are as follows: a=bottom, b=right, c=top, and d=left.
-TICKAXes
connects the ticks with a solid axis. Usually, GCG programs draw ticks floating in space.
-POIntcolor=1
defines the color for the points as follows: Black=1, Green=2, Blue=3, and Red=4.
-SYMbol=0
defines a centered symbol to be used for every point. The available symbols are Point=0, Square=1, Octagon=2, Triangle=3, +=4, X=5, Diamond=6, *=7, and |=8.
-SYMBOLHeight=0.18
defines the height for symbols (other than points) in units of one percent of the plotter's vertical axis (one platen unit).
-DOTSonly
When several adjacent points occur on a diagonal at the same level of suboptimality, PlotFold speeds up the plot by connecting them with a line. This parameter forces PlotFold to avoid this shortcut and plot all of the dots.
-NOAXis
suppresses drawing an axis of symmetry along the central diagonal of the plot.
Circles Plot (C)
-SHOwseq
prints the sequence around the circumference.
-NUMbering=10
This program tries to number the ticks on each axis at an interval that gives about three to six numbered ticks. Use this parameter to set the numbering interval to please yourself. You can suppress tick numbering altogether with -NONUMbering.
-NOTICks
suppresses the ticks and their numbers. -TICks is the default.
-CHOrds
connects bases with straight lines. Normally Circles uses circular segments to connect bases that are paired.
Domes Plot (D)
-SHOwseq
prints the sequence itself below the number line.
-NUMbering=10
This program tries to number the ticks on each axis at an interval that gives about three to six numbered ticks. Use this parameter to set the numbering interval to please yourself. You can suppress tick numbering altogether with -NONUMbering.
-NOTICks
suppresses the ticks and their numbers. -TICks is the default.
-DENsity=207.14
sets the number of bases or amino acids per 100 platen units (PU). This is usually equivalent to the number of bases or amino acids per page. Output from different GCG graphics programs that are run at the same density can be compared by lining up the plots on a light box.
-MINortomajor=0.8
sets the aspect of the ellipses, rectangles, or diamonds by setting the ratio of the minor to major axes. The default is 0.8. A ratio of 1.0 makes the ellipses into perfect circles and the rectangles into squares. You can set an aspect ratio for the plot that pushes the label and the axis off the platen completely.
-RECtangles
draws rectangles instead of ellipses to connect the paired bases.
-PEAks
draws diamonds instead of ellipses to connect the paired bases.
Mountains Plot (E)
-SHOwseq
replaces the dots around the plot with letters showing the bases. -NOSHOwseq is the default.
-NUMbering=10
This program tries to number the ticks on each axis at an interval that gives about three to six numbered ticks. Use this parameter to set the numbering interval to please yourself. You can suppress tick numbering altogether with -NONUMbering.
-NOTICks
suppresses the ticks and their numbers. -TICks is the default.
-STEMdepth=45
is the number of stems that can be stacked on the y axis on each page. The default is the number calculated to fit on one page.
Squiggles Plot (F)
-SHOwseq[=32,45]
labels the bases. -SHOwseq used without the optional values provides a labeling letter for each base. -SHOwseq specified with values allows you to label only a particular substructure of interest. -NOSHOwseq is the default.
-SEQHeight=0.9
is the character size for base and number labels. The default is 0.9 platen units, and the allowable range is from 0.2 to 5.0.
-NUMbering=5
puts a sequence number as a label at every fifth base. The default interval is 10. -NONUMbering suppresses the tick numbering.
-PIVot=i,j,theta
allows you to pivot stems or other substructures to make the graph more readable and fix collisions between stems. The first number, where the pivoting begins, should be the underarm of an arm to be bent. The second number should be the corresponding shoulder. The third number is the number of degrees (-360 to 360) the structure should be rotated counterclockwise.
Overlapping intervals may be used, if you wish, to pivot a large structure at one angle and a smaller portion of that structure at another angle.
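The effect of a counterclockwise pivot on plotted coordinates can be illustrated with a plain two-dimensional rotation. This is only a geometric sketch: which point Squiggles actually uses as the center of rotation is not stated here, so the `origin` argument is an assumption, and the function name is hypothetical.

```python
import math


def pivot(points, origin, theta_degrees):
    """Rotate (x, y) points counterclockwise about `origin` by `theta_degrees`."""
    if not -360 <= theta_degrees <= 360:
        raise ValueError("theta must be between -360 and 360 degrees")
    theta = math.radians(theta_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    ox, oy = origin
    rotated = []
    for x, y in points:
        dx, dy = x - ox, y - oy
        rotated.append((ox + dx * cos_t - dy * sin_t, oy + dx * sin_t + dy * cos_t))
    return rotated


# A point one unit to the right of the pivot moves to one unit above it.
x, y = pivot([(1.0, 0.0)], origin=(0.0, 0.0), theta_degrees=90)[0]
print(round(x, 6), round(y, 6))
```

Applying a second rotation to a subset of the already rotated points corresponds to the overlapping intervals described above, where a smaller portion of a structure is pivoted at a different angle.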
Text Output (G)
-LINesize=80
lets you set the maximum number of characters per line to any number between 40 and 255.
The parameters below apply to all GCG graphics programs. These and many others are described in detail in Chapter 5, Using Graphics of the User's Guide.
-FIGure[=FileName]
writes the plot as a text file of plotting instructions suitable for input to the Figure program instead of drawing the plot on your plotter.
-FONT=3
draws all text characters on the plot using Font 3 (see Appendix I).
-COLor=1
draws the entire plot with the pen in stall 1.
The parameters below let you expand or reduce the plot (zoom), move it in either direction (pan), or rotate it 90 degrees (rotate).
-SCAle=1.2
expands the plot by 20 percent by resetting the scaling factor (normally 1.0) to 1.2 (zoom in). You can expand the axes independently with -XSCAle and -YSCAle. Numbers less than 1.0 contract the plot (zoom out).
-XPAN=10.0
moves the plot to the right by 10 platen units (pan right).
-YPAN=10.0
moves the plot up by 10 platen units (pan up).
-PORtrait
rotates the plot 90 degrees. Usually, plots are displayed with the horizontal axis longer than the vertical (landscape). Note that plots are reduced or enlarged, depending on the platen size, to fill the page.
Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com
Copyright (c) 1982, 1983, 1985, 1986, 1987, 1989, 1991, 1994, 1995, 1996 Genetics Computer Group, Inc. All rights reserved.
Licenses and Trademarks: Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.
All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.