GRIFn is a novel system for interactive evaluation of functional genomic data and methods. It allows you to upload your own data, view evaluations in multiple contexts, and compare your data with other published high-throughput data.
To use all the features of GRIFn, you must install the Adobe Scalable Vector Graphics (SVG) plugin and make sure that you are NOT blocking pop-up windows in your browser (this is an option in your browser preferences menu). Internet Explorer is the recommended browser when running GRIFn on Windows, and Safari is the recommended browser when running GRIFn on Mac OS.
Windows
Only Internet Explorer is recommended for use with GRIFn on Windows. When prompted
by your browser (on the results page) to install the SVG plugin from Adobe, agree
to download. The browser should then automatically install the plugin and display
the graph. Please do not use Firefox on Windows: the SVG images on the results page
will not display properly with Firefox.
Mac OS 10.x
Safari is the only recommended browser when running GRIFn on
Mac OS.
GRIFn is intended to facilitate both general evaluation of functional genomic datasets and results, and targeted, process-specific evaluation. From the home page, you can upload a file in one of two formats (click here for info on file formats). Type a name for your dataset (this will be used to label plots), and choose whether to include our collection of high-throughput reference datasets for comparison (we recommend you do). The default behavior from the home page is to run both a general evaluation and a process-specific evaluation across the set of GO Slim terms. Once you are familiar with the system and would like to choose your own types of evaluation, you can go directly to the advanced evaluation options by clicking the "here" link below the "Evaluate" button.

Once you click "Evaluate", you will be redirected to the results page. Near the right side, you will notice the "Session status" pane, which tracks the progress of any datasets you've uploaded and any evaluations you've run. The status is updated whenever you refresh your browser window. At any time, you can click on the ID next to either a dataset or an evaluation to view its details. You can also choose "Upload more datasets" or "Perform another evaluation" at any time. The session status pane is present on all pages once you've started your session.

When your dataset has been uploaded and the initial evaluations have been completed, two results panes will appear: one with the results of the general evaluation and one with the results of the process-specific evaluation. The general evaluation consists of a precision-recall analysis of your data and all of the reference datasets against the general functional gold standard described in detail in our manuscript and supplementary material. So that users can understand exactly which processes are being recovered, clicking on any point of the precision-recall curve displays a pie chart below the precision-recall graph. The pie chart shows how the true positive (TP) protein-protein pairs recovered at the corresponding precision and recall are distributed across the processes represented in the biological process GO. This can help the user readily identify processes that are well represented in the data or detect significant biases toward certain processes. If you've identified a bias and would like to exclude the corresponding process from the analysis, see the section "Adding more datasets or evaluations to your session". Click "Maximize Graph" to view the results in a separate window, or "Download XML-formatted Results" to view a text version of the results. Click here for a short overview of how to get around SVG images.
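As a rough illustration of the precision-recall analysis described above, the following Python sketch ranks scored pairs and reports precision and recall at each rank cutoff. It is not GRIFn's implementation: the function name, the data layout, and the simplification that every pair outside the positive set counts as a negative are all assumptions made for this example.

```python
def precision_recall_points(scored_pairs, positives):
    """Return (recall, precision) tuples, one per rank cutoff.

    scored_pairs: list of ((orf_a, orf_b), score) tuples (hypothetical layout).
    positives: set of frozenset({orf_a, orf_b}) pairs treated as true.
    Assumption: any pair not in `positives` is counted as a negative.
    """
    # Highest score first, so each cutoff keeps the top-ranked pairs.
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    total_pos = sum(1 for pair, _ in ranked if frozenset(pair) in positives)
    if total_pos == 0:
        return []  # recall is undefined without any positive pair
    points = []
    true_pos = 0
    for rank, (pair, _score) in enumerate(ranked, start=1):
        if frozenset(pair) in positives:
            true_pos += 1
        points.append((true_pos / total_pos, true_pos / rank))
    return points


example_pairs = [
    (("YAL001C", "YAL002W"), 0.9),
    (("YAL001C", "YAL003W"), 0.8),
    (("YAL002W", "YAL003W"), 0.4),
    (("YAL003W", "YAL005C"), 0.1),
]
example_positives = {
    frozenset(("YAL001C", "YAL002W")),
    frozenset(("YAL002W", "YAL003W")),
}
example_points = precision_recall_points(example_pairs, example_positives)
```

Each point corresponds to one clickable position on a precision-recall curve; the first point here has recall 0.5 and precision 1.0 because the top-ranked pair is a positive.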

The second result pane displays the results of the process-specific evaluation. By default, your dataset and the reference datasets are evaluated against the set of GO slim processes. See the "Adding more datasets or evaluations to your session" section below on how to add more evaluations on any process you're interested in. For each dataset and process, a precision-recall analysis is done much like for the general evaluation discussed above. For this evaluation, however, only proteins involved in the particular process of interest are used in estimating the precision and recall. The results of all of these analyses are summarized with the area under the precision-recall curve (AUPRC) and presented in a grid. The redder the corresponding square in the grid, the higher the AUPRC, and thus the better the precision and recall, for a particular dataset/GO term combination. By clicking on the square, you can retrieve the actual precision-recall curve for that process.
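The two ideas in this paragraph, restricting the analysis to one process and summarizing a curve by its area, can be sketched in Python as follows. This is an illustration under stated assumptions, not GRIFn's code: the function names are hypothetical, a pair is kept only when both genes belong to the process (the text above does not say whether one or both are required), and the area is computed with a plain trapezoidal rule over the points as given.

```python
def restrict_to_process(scored_pairs, process_genes):
    """Keep only pairs whose two genes are both annotated to the process.

    Assumption: "both genes" is the membership rule; GRIFn may use another.
    """
    return [
        (pair, score)
        for pair, score in scored_pairs
        if pair[0] in process_genes and pair[1] in process_genes
    ]


def auprc(points):
    """Trapezoidal area under (recall, precision) points.

    points must be ordered by non-decreasing recall, e.g. in rank order.
    No extrapolation is done before the first or after the last point.
    """
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


kept = restrict_to_process(
    [(("YAL001C", "YAL002W"), 0.9), (("YAL001C", "YAL003W"), 0.5)],
    {"YAL001C", "YAL002W"},
)
area = auprc([(0.0, 1.0), (0.5, 1.0), (1.0, 0.5)])
```

A single number per dataset and process is what makes the grid view possible: each square can be colored by one AUPRC value rather than by a whole curve.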

In addition to viewing the actual precision-recall results, you can view summary bar charts across either a single dataset or a single biological process. By clicking the gray square at the top of the column corresponding to a particular GO term, you will retrieve a plot like the one below, where the AUPRC scores for all datasets are plotted side by side for the process of interest. This allows for a direct comparison of dataset quality if you have a particular process in mind.

If you're interested in seeing how well a particular dataset does across the whole range of processes evaluated, you can click on the gray square at the left edge of each row. This will result in a bar chart like that displayed below where the AUPRC results for a dataset of interest are plotted for each of the biological processes studied.

Click "Maximize Graph" to view the results in a separate window, or "Download XML-formatted Results" to view a text version of the results. Click here for a short overview of how to get around SVG images.
Adding more datasets or evaluations to your session
If you're interested in either uploading more datasets or running more evaluations (say on your own set of biological processes, etc.), you can click either the "Upload more datasets" link or the "Perform another evaluation" link on the "Session status" pane.
If you choose to upload more datasets, you'll be redirected to a page with the following form. Enter another identifier for the new dataset and choose the file you want to upload (see here for info on file formats). As described in the file formats section, there are two types of files that can be analyzed with GRIFn: paired data and profile data. Profile data is simply data where each gene or protein has a set of features (e.g. gene expression measurements across a set of conditions). If you're uploading profile data, you can choose either Pearson correlation or Euclidean distance as the similarity metric. GRIFn then computes a similarity score between all possible pairs of genes/proteins in your data.
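To make the step from profile data to pair scores concrete, here is a minimal Python sketch that computes Pearson correlation or Euclidean distance for every pair of genes. The function names and the dictionary layout are assumptions for illustration only. Note also that Euclidean distance is smaller for more similar profiles; how GRIFn turns it into a similarity score is not described here, so the sketch returns the raw distance.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: a constant profile scores 0
    return cov / (norm_x * norm_y)


def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pairwise_scores(profiles, metric="pearson"):
    """Score every unordered pair of genes in {gene: [values, ...]}."""
    metrics = {"pearson": pearson, "euclidean": euclidean}
    score = metrics[metric]
    return {
        (g1, g2): score(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


example_profiles = {
    "YAL001C": [1.0, 2.0, 3.0],
    "YAL002W": [2.0, 4.0, 6.0],
    "YAL003W": [3.0, 2.0, 1.0],
}
by_pearson = pairwise_scores(example_profiles, "pearson")
by_distance = pairwise_scores(example_profiles, "euclidean")
```

With n genes this produces n(n-1)/2 pair scores, which is why profile uploads can take noticeably longer to process than paired data of similar file size.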

If you choose to perform more evaluations, you will be directed to a page that has the following three forms. The first form is for configuring your new evaluation. Assign a name to this evaluation (this is how the evaluation will appear in the session status pane). Next, select the type of evaluation(s) you would like to perform. You can select another general evaluation and/or a process-specific evaluation. If you select a general evaluation, note the text box immediately below the checkbox. If you wish to exclude any processes (GO terms) from the general evaluation, you can enter their GO IDs here. Entering a GO ID here means that any protein pair related to that particular process will not be considered in the general precision-recall analysis.
If you select a process-specific evaluation, notice the text box immediately below that checkbox as well. This allows you to enter the set of processes over which you'd like to evaluate your data. Enter as many GO IDs as you wish, one per line (press Enter after each ID). If you enter no GO terms, GRIFn will evaluate on its default set, the set of GO slim terms.

The second step in configuring your new evaluation is to select which of your uploaded datasets you'd like to evaluate. Any datasets selected here will be included on result plots. Choose as many or as few as you wish. If you need to upload more datasets, click "Upload more datasets".

The final step in configuring your new evaluation is to select the reference datasets you would like to evaluate. Again, select as many or as few as you wish. The selected datasets will appear on all results plots.

GRIFn accepts two file formats: paired data and profile data. Since GRIFn computes all precision-recall estimates on protein-protein pairs, any dataset you upload must map to similarity scores between gene-gene pairs. The accepted format for paired data is a tab-delimited file with each line consisting of <ORF name 1><tab><ORF name 2><tab><score>. The score must be numeric; this is the only restriction. An example of data in such a format is shown below:

Note that if you use Excel to generate this file, you should save it as Text (tab-delimited).
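If you generate or check a paired-data file with a script rather than Excel, something like the following Python sketch may help. It only enforces what is stated above (three tab-separated fields per line, numeric score); the function name and error messages are hypothetical, and whether GRIFn tolerates blank lines is an assumption.

```python
import csv
import io


def read_pair_file(handle):
    """Parse <ORF name 1><tab><ORF name 2><tab><score> lines.

    Returns a list of (orf_a, orf_b, score) tuples and raises ValueError
    on the first malformed line. Blank lines are skipped (assumption).
    """
    pairs = []
    reader = csv.reader(handle, delimiter="\t")
    for line_no, row in enumerate(reader, start=1):
        if not row:
            continue
        if len(row) != 3:
            raise ValueError(
                f"line {line_no}: expected 3 tab-separated fields, got {len(row)}"
            )
        orf_a, orf_b, raw_score = row
        try:
            score = float(raw_score)
        except ValueError:
            raise ValueError(
                f"line {line_no}: score {raw_score!r} is not numeric"
            ) from None
        pairs.append((orf_a, orf_b, score))
    return pairs


sample = io.StringIO("YAL001C\tYAL002W\t0.9\nYAL001C\tYAL003W\t-1\n")
parsed = read_pair_file(sample)
```

Running a check like this before uploading catches the most common problems, such as a file saved with commas or spaces instead of tabs.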
You can also upload profile data, that is, data in which each gene has a value over a set of features (e.g. gene expression measurements). Profile data should be in the standard .pcl format, which is essentially tab-delimited, with a GWEIGHT column and an EWEIGHT row to separate the data from the experiment and gene labels. Please label the first column YORF, and make sure the identifiers are valid yeast ORFs. An example profile dataset in valid format is shown below:

If you upload profile data, you must also specify what distance metric GRIFn should use for similarity between gene-gene pairs. The options are either Pearson correlation or Euclidean distance. The default is Pearson correlation (this is the option used if you upload your dataset from the GRIFn home page).
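For users who build profile files programmatically, the Python sketch below writes a file in the commonly used PCL layout: a header row starting with YORF, NAME, and GWEIGHT, an EWEIGHT row, and then one row per gene. The NAME column and the weight value of 1 are assumptions taken from the general PCL convention rather than from the description above, and the function name and experiment labels are hypothetical, so compare the output against a file GRIFn is known to accept before relying on it.

```python
import csv
import io


def write_pcl(handle, experiments, profiles):
    """Write {orf: [values, ...]} as a tab-delimited PCL-style table.

    Assumed layout: YORF, NAME, GWEIGHT columns, then one column per
    experiment; the second row holds the EWEIGHT values.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["YORF", "NAME", "GWEIGHT"] + list(experiments))
    writer.writerow(["EWEIGHT", "", ""] + [1] * len(experiments))
    for orf, values in profiles.items():
        if len(values) != len(experiments):
            raise ValueError(f"{orf}: expected {len(experiments)} values")
        # The ORF is reused as the NAME field; gene weight is set to 1.
        writer.writerow([orf, orf, 1] + list(values))


buffer = io.StringIO()
write_pcl(buffer, ["heat_10min", "heat_20min"], {"YAL001C": [0.5, -1.2]})
pcl_lines = buffer.getvalue().split("\n")
```

Writing through the csv module with a tab delimiter avoids the stray spaces and quoting differences that hand-assembled lines tend to introduce.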
To pan an SVG image, hold the alt key and click-and-drag with the mouse. To zoom in, hold the control key and click at the location you want to enlarge, or hold the control key and click-and-drag to select a region to zoom into. To zoom out, hold the shift key as well (control + shift) while clicking.
For additional help or to report bugs, please email clmyers@princeton.edu. All suggestions are welcome!
