Example Protocol Scripts


This page contains a selection of example protocol scripts demonstrating how MotifLab can be used to solve different analysis tasks of varying complexity. The protocols can be imported into MotifLab in three different ways:

Manual copy
Click on the link of a protocol below (in the 'Protocol' column of the table) to display the protocol in your web browser. Press CTRL+A (or COMMAND+A on Mac) to select the entire protocol, followed by CTRL+C (or COMMAND+C) to copy the contents of the page to the system clipboard. Go to MotifLab and select "New Protocol..." from the File-menu to create a new empty protocol. Click anywhere within the protocol editor and then press CTRL+V (or COMMAND+V on Mac) to paste the protocol into the editor.

Import protocol from file
Right-click on the link of a protocol below and select "Save Target As..." (Internet Explorer) or "Save Link As..." (Firefox) from the context menu to save the protocol to your local file system. Go to MotifLab and select "Open Protocol..." from the File-menu and use the dialog box that appears to open the file you just saved.

Import protocol from URL
Right-click on the link of a protocol and select "Copy Shortcut..." (Internet Explorer) or "Copy Link Location..." (Firefox) from the context menu. Go to MotifLab and select "Open Protocol from URL..." from the File-menu and then press CTRL+V (or COMMAND+V on Mac) to paste the URL address into the dialog box that appears.

After a protocol has been imported into MotifLab, press the "Execute" button in the toolbar to run the protocol.

Most of the protocol scripts listed here will ask the user to specify a set of sequences to perform the analysis on if no sequences are currently loaded. However, it is usually possible to run the analysis on a precompiled set of example sequences by removing the #-sign in front of the line found near the top of the protocol that reads
#AllSequences = new Sequence Collection(File:"....")
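For a protocol that has already been saved to a local file, the same edit can be scripted outside MotifLab. The following Python sketch is only an illustration: the function name `enable_example_sequences` is hypothetical, and it assumes that the commented-out line begins with `#AllSequences = new Sequence Collection(` as shown above.

```python
import re

# Matches a commented-out line such as
#   #AllSequences = new Sequence Collection(File:"....")
# Leading whitespace before the '#' is tolerated (an assumption).
_COMMENTED_LINE = re.compile(
    r'^(\s*)#\s*(AllSequences\s*=\s*new Sequence Collection\(.*)$'
)


def enable_example_sequences(protocol_text: str) -> str:
    """Remove the '#' in front of the first commented-out AllSequences line.

    All other lines, including ordinary comments, are left unchanged.
    A trailing newline at the end of the text is not preserved.
    """
    lines = protocol_text.splitlines()
    for index, line in enumerate(lines):
        match = _COMMENTED_LINE.match(line)
        if match:
            lines[index] = match.group(1) + match.group(2)
            break  # only the first matching line is changed
    return "\n".join(lines)
```

The result can be written back to the file and opened with "Open Protocol..." as described above.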

| Protocol | Description | Notes |
| Motif Count | This protocol will perform motif scanning and output a table showing the number of times each motif is found in the sequences. An overrepresentation p-value is calculated for each motif by comparing the motif's observed frequency in the sequences to its expected frequency based on the number of times it is predicted in a scrambled version of the original sequences (i.e. randomly created artificial sequences with the same oligonucleotide composition as the original sequences). | |
| Estimate expected motif occurrence frequencies | This protocol will generate 50 random DNA sequences of a specified length based on a chosen background model, perform motif scanning in these sequences, and calculate the occurrence frequency of each motif. These frequencies can be used as "expected frequencies" to calculate p-values for overrepresentation with the "count motif occurrences" analysis. Note that the artificial control sequences should be scanned with the same scanning method and settings that were used when analysing the actual target sequences. | |
| TFBS filtering | This protocol performs motif scanning in a set of sequences with motifs from TRANSFAC and then filters out predictions according to different criteria: binding sites that are not conserved, binding sites overlapping with known repeat regions, binding sites that are not located within a DNase hypersensitivity site, binding sites that are not supported by a ChIP-seq peak region for the corresponding TF, or binding sites that do not have sites for known interaction partners within a specified distance. | |
| Motif Discovery Benchmark | This protocol will generate 20 random DNA sequences of a specified length based on a chosen background model and then plant up to 5 selected motifs at random locations in these sequences. A few de novo motif discovery methods will be run to predict the locations of the motifs in the sequences, and the performance of the methods will be evaluated. Note that in order to use this protocol you must have installed and configured the motif discovery methods used here (or rewrite the protocol to use other methods instead). | |
| Forskolin analysis | This is a slightly modified version of the protocol used for the third use case example presented in the MotifLab publication, which identifies interesting motifs and binding sites in promoter regions of genes whose expression was significantly changed in response to treatment with forskolin. The protocol performs motif scanning and then finds motifs that are significantly overrepresented compared to an expected motif frequency, motifs that have a high average conservation across all binding sites, and motifs that tend to appear in the same location relative to the TSS in several sequences. Finally, results from these three analyses are collated into a larger analysis and the motifs are ranked according to the combined rank sum of these three properties. | |
| Simple use of positional priors to guide motif discovery | This protocol demonstrates how some simple numeric tracks, where the value in each position correlates with the probability of observing a binding site in that position, can be used as "positional priors" tracks that guide motif discovery programs towards the target motifs. Tracks such as Conservation, DNase hypersensitivity tracks and ChIP-seq tracks can be used either directly or with minimal processing (depending on the motif discovery method used). The protocol also demonstrates how operations can be used to manually create a positional priors track with a specific search focus, namely searching for additional motifs in neighborhood regions around known binding sites. | |
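The "Motif Count" and "Estimate expected motif occurrence frequencies" descriptions both rest on comparing an observed motif count with an expected frequency taken from control sequences. This page does not say which statistic MotifLab uses, so the Python sketch below assumes a one-sided binomial test purely for illustration. All function and parameter names are hypothetical.

```python
from math import exp, lgamma, log


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if k <= 0:
        return 1.0
    if k > n or p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        # log of C(n, i) * p^i * (1 - p)^(n - i)
        log_pmf = (
            lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1.0 - p)
        )
        total += exp(log_pmf)
    return min(total, 1.0)


def overrepresentation_pvalue(
    observed_hits: int,
    positions_scanned: int,
    control_hits: int,
    control_positions: int,
) -> float:
    """P-value for seeing at least `observed_hits` motif matches.

    The expected per-position hit rate is estimated from the control
    (scrambled or randomly generated) sequences. Treating positions as
    independent trials is an assumption of this sketch.
    """
    if control_positions <= 0:
        raise ValueError("control_positions must be positive")
    expected_rate = control_hits / control_positions
    return binomial_upper_tail(observed_hits, positions_scanned, expected_rate)
```

With a control rate of 10 hits per 1000 positions, 30 hits in 1000 target positions gives a very small p-value, while 10 hits does not stand out.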
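The "Motif Discovery Benchmark" and "Estimate expected motif occurrence frequencies" protocols both start from random sequences drawn from a background model. The Python sketch below illustrates that idea under simplifying assumptions that are not taken from MotifLab: the background is a zero-order model (independent base frequencies), motifs are given as plain consensus strings, and exactly one motif instance is planted per sequence. All names are hypothetical.

```python
import random
from typing import Dict, List, Tuple


def random_sequence(length: int, background: Dict[str, float],
                    rng: random.Random) -> str:
    """Draw a DNA sequence from a zero-order background model."""
    bases = list(background)
    weights = [background[base] for base in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))


def plant_motif(sequence: str, motif: str,
                rng: random.Random) -> Tuple[str, int]:
    """Overwrite a random window of `sequence` with `motif`.

    Returns the modified sequence and the 0-based start of the planted site.
    """
    if len(motif) > len(sequence):
        raise ValueError("motif is longer than the sequence")
    start = rng.randrange(len(sequence) - len(motif) + 1)
    planted = sequence[:start] + motif + sequence[start + len(motif):]
    return planted, start


def make_benchmark(n_sequences: int, length: int,
                   background: Dict[str, float], motifs: List[str],
                   seed: int = 0) -> List[Tuple[str, str, int]]:
    """Build (sequence, planted motif, start) records for a benchmark."""
    rng = random.Random(seed)  # fixed seed makes the data set reproducible
    dataset = []
    for _ in range(n_sequences):
        sequence = random_sequence(length, background, rng)
        motif = rng.choice(motifs)
        sequence, start = plant_motif(sequence, motif, rng)
        dataset.append((sequence, motif, start))
    return dataset
```

Because the planted positions are recorded, the predictions of a motif discovery method can later be compared against them.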
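The final step of the "Forskolin analysis" protocol ranks motifs by the combined rank sum of three properties. A minimal Python sketch of rank-sum combination follows. It assumes that each property has already been turned into a score where higher is better (for example, a negated log p-value for overrepresentation), and it breaks ties by motif name. The function names are hypothetical and this is not MotifLab's implementation.

```python
from typing import Dict, List, Tuple


def rank_descending(scores: Dict[str, float]) -> Dict[str, int]:
    """Assign rank 1 to the highest score; ties are broken by name."""
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def combined_rank_sum(*criteria: Dict[str, float]) -> List[Tuple[str, int]]:
    """Rank motifs by the sum of their ranks across all criteria.

    Only motifs that have a score in every criterion are kept.
    The result is sorted so that the best (lowest) rank sum comes first.
    """
    if not criteria:
        return []
    shared = set.intersection(*(set(scores) for scores in criteria))
    per_criterion = [
        rank_descending({name: scores[name] for name in shared})
        for scores in criteria
    ]
    totals = {
        name: sum(ranks[name] for ranks in per_criterion) for name in shared
    }
    return sorted(totals.items(), key=lambda item: (item[1], item[0]))
```

A motif that ranks first on two properties and second on the third gets a rank sum of 4 and is listed ahead of motifs with larger sums.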
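The last protocol builds a positional priors track that focuses the search on neighborhoods around known binding sites. In MotifLab this is done with its own operations. The Python sketch below only illustrates the underlying idea on a plain list of per-position values. The 0-based inclusive coordinates, the parameter names, and the choice to give the known sites themselves the low value (so that the search targets additional motifs) are assumptions of this example.

```python
from typing import List, Tuple


def neighborhood_prior(sequence_length: int,
                       known_sites: List[Tuple[int, int]],
                       flank: int,
                       inside: float = 1.0,
                       outside: float = 0.0) -> List[float]:
    """Build a per-position prior that is high in the flanks of known sites.

    `known_sites` holds (start, end) pairs, 0-based and inclusive.
    Positions within `flank` bases of a site get `inside`; all other
    positions, including the sites themselves, get `outside`.
    """
    track = [outside] * sequence_length
    # Raise the prior in the flanking regions of every known site.
    for start, end in known_sites:
        left = range(max(0, start - flank), max(0, start))
        right = range(min(sequence_length, end + 1),
                      min(sequence_length, end + 1 + flank))
        for pos in (*left, *right):
            track[pos] = inside
    # Mask the known sites last, so a flank of one site never covers another.
    for start, end in known_sites:
        for pos in range(max(0, start), min(sequence_length, end + 1)):
            track[pos] = outside
    return track
```

For a 10 bp sequence with a known site at positions 4 to 5 and a flank of 2, only positions 2, 3, 6 and 7 receive the high value.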