Mapfold FAQ

 

1.) What is Mapfold?

Mapfold is a web server that carries out calculations based on the ideas developed in Quarrier et al. (2010), which can help you better interpret your chemical mapping data. It can be used to predict a secondary structure, but it is intended more to help you understand how well the different orthogonal data sets you have collected agree with each other. Although Mapfold will work with a single data set, it becomes really useful with two or three.

2.) Sounds great, how do I use it?

You will have to format your data in a table similar to the one provided here. The format is pretty simple and looks like this:

Sequence	Ideal	DMS	T1
A	1	x	x
A	1	x	x
U	1	x	x
U	0	x	x
G	0	x	0.06
C	0	0.00	x
G	0	x	0.02
G	0	x	0.03
G	0	x	0.08
A	1	0.80	x
A	1	1.00	x
A	1	0.96	x
G	0	x	0.01
G	0	x	0.08
G	0	x	0.07
G	0	x	0.06

You can make a file like this in Excel by saving it as a "Tab Delimited Text File." The important points to remember are the following.

A.) The first column should be the sequence of the RNA you are interested in.

B.) The next columns are the data. In the sample data, the first data column is ideal data, i.e. it is the correct answer: we determined from the crystal structure of P4P6 (the RNA we are simulating in the example file) whether each base was paired or not. The next two columns are DMS and T1 data collected at 100 mM KCl.

C.) Your data should be normalized between 0 and 1. 1 should mean unpaired and 0 paired, but you can have values in between when the protection is not as well defined. Think of each value as the probability that the base is unpaired.

D.) If you do not have data, you can use an x to indicate no data. For example, T1 only probes Gs so we put x for all other nucleotides in our data.

Make sure there are no blank spaces or extra lines at the bottom of your file; these can affect the analysis.
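If you want to check a file before uploading it, the points above can be sketched as a small Python script. This is only an illustration and not part of Mapfold: the function name `read_mapfold_table` is made up, and it assumes that the sequence uses the upper-case letters A, C, G and U and that a single newline ending the last row is acceptable.

```python
import io

# Assumption: the sequence column uses upper-case RNA letters only.
VALID_BASES = set("ACGU")


def read_mapfold_table(handle):
    """Parse a tab-delimited table and return (header, rows).

    Each row is (base, values), where a value is a float in [0, 1]
    or None when the file contains an "x" (no data).
    Raises ValueError if the file breaks one of the formatting points.
    """
    lines = handle.read().split("\n")
    # Assumption: one newline terminating the last row is fine;
    # anything after it counts as an extra line.
    if lines and lines[-1] == "":
        lines.pop()
    if not lines:
        raise ValueError("file is empty")

    header = lines[0].split("\t")
    rows = []
    for number, line in enumerate(lines[1:], start=2):
        if line.strip() == "":
            raise ValueError(f"line {number}: blank or extra line")
        fields = line.split("\t")
        if len(fields) != len(header):
            raise ValueError(f"line {number}: expected {len(header)} columns")
        base = fields[0]
        if base not in VALID_BASES:
            raise ValueError(f"line {number}: unexpected nucleotide {base!r}")
        values = []
        for field in fields[1:]:
            if field == "x":
                values.append(None)  # no data for this nucleotide
                continue
            value = float(field)  # raises ValueError on non-numeric text
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"line {number}: {value} is not between 0 and 1")
            values.append(value)
        rows.append((base, values))
    return header, rows


sample = "Sequence\tIdeal\tDMS\tT1\nA\t1\tx\tx\nG\t0\tx\t0.06\nC\t0\t0.00\tx\n"
header, rows = read_mapfold_table(io.StringIO(sample))
print(header)
print(rows)
```

To check a real file, open it with `open(path, newline="")` and pass the file object to the function in place of the `io.StringIO` sample.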

3.) OK, so how do I interpret my results?

This is what the Mapfold results look like:

There are a couple of important points.

A.) The predicted structure for a particular data set is plotted in the center. Here, this is the structure that most closely agrees with your first data set (in this case the "Ideal" data set).

B.) To the right, Mapfold reports the Manhattan distance of this structure to your different data sets. In this case the structure has a distance of 12 from the Ideal data, 12.2 from the DMS data and 9.77 from the T1 data. As a rule of thumb, you can divide the Manhattan distance by 2 to estimate the number of base pairs that are wrong. A Manhattan distance of 12 therefore means the structure is probably off by about 6 base pairs, which is a good prediction!

C.) To the left you can select all the predicted structures for the different data sets and their combinations.
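To make the Manhattan distance and the divide-by-2 rule of thumb concrete, here is a minimal Python sketch. It is not Mapfold's own code: it assumes a structure is written as one value per nucleotide (1 for unpaired, 0 for paired, as in the input table) and that positions marked x are simply skipped. Both function names are made up for this example.

```python
def manhattan_distance(structure, data):
    """Sum of |structure - data| over the positions that have data.

    structure: list of 0/1 values (0 = paired, 1 = unpaired).
    data: list of floats in [0, 1], with None where the file has an "x".
    Assumption: positions without data do not contribute to the distance.
    """
    if len(structure) != len(data):
        raise ValueError("structure and data must have the same length")
    return sum(abs(s - d) for s, d in zip(structure, data) if d is not None)


def estimated_wrong_base_pairs(distance):
    """Rule of thumb from the FAQ: half the Manhattan distance."""
    return distance / 2


# "Ideal" and DMS columns of the sample table (None stands for x).
ideal = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
dms = [None, None, None, None, None, 0.00, None, None, None,
       0.80, 1.00, 0.96, None, None, None, None]

print(round(manhattan_distance(ideal, dms), 2))  # prints 0.24
print(estimated_wrong_base_pairs(12))            # prints 6.0
```

The 16 nucleotides of the sample table give a very small distance; the value of 12 quoted above is for the result shown on the results page.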

4.) Wow, that is a lot of information. How do I get a take-home message from this analysis?

Mapfold samples thousands of RNA structures, and you can specify how many on the Mapfold home screen, seen below:

The default is 1000, but you may want to sample more structures to get more accurate results. Changing the 1 to a 5 tells Mapfold to sample 5000 structures instead of 1000.

To get back to the take-home message, look carefully at the distances:

What you see is that all three data sets (Ideal, DMS and T1) agree with this structure. From this you can conclude that the T1 and DMS data you collected are in agreement and that the prediction Mapfold is giving you is accurate. Were one of the data sets to have a much larger distance than the others, you may want to recollect that data.
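If you have many data sets, the comparison can be sketched in a few lines of Python. This is an illustration only: Mapfold does not define what "much larger" means, so the cutoff used here (twice the median distance) and the function name `flag_disagreeing_data_sets` are arbitrary assumptions you should adjust to your own data.

```python
from statistics import median


def flag_disagreeing_data_sets(distances, factor=2.0):
    """Return the data sets whose distance exceeds factor * median distance.

    distances: dict mapping a data set name to its Manhattan distance.
    factor: arbitrary cutoff chosen for this sketch, not a Mapfold setting.
    """
    typical = median(distances.values())
    return [name for name, d in distances.items() if d > factor * typical]


# Distances reported in the example results above.
reported = {"Ideal": 12.0, "DMS": 12.2, "T1": 9.77}
print(flag_disagreeing_data_sets(reported))  # prints []

# Made-up case in which the T1 data disagrees with the other two.
hypothetical = {"Ideal": 12.0, "DMS": 12.2, "T1": 30.0}
print(flag_disagreeing_data_sets(hypothetical))  # prints ['T1']
```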