FAQ

Downloading feature data takes a long time. Is there any way to speed up this process?

MotifLab downloads data for feature tracks one sequence at a time, so downloading data for many sequences can be quite time-consuming. Also, most of the preconfigured feature data tracks available in MotifLab are downloaded from the UCSC Genome Browser, which was designed for interactive use and does not handle heavy program-driven access well. In fact, the UCSC web site suggests that automatic download of data should be limited to one download per 15 seconds and a maximum of 5000 downloads per day. Other sites have similar restrictions, which is why MotifLab can be configured to wait a short period of time between requests to avoid overloading the servers. To change this waiting period, go to the "Configure" menu, select "Configure Datatracks..." and press the "Configure Server Settings" button. The waiting period can be specified individually for each server in the "Delay (ms)" column. The default setting for tracks from the UCSC Genome Browser is to wait 3 seconds (3000 ms) between each sequence, but this can be changed to a smaller number (even 0 ms) to speed up the downloading process. If you have your own local mirror of the UCSC Genome Browser, you can specify that it should be used in preference to the normal server for all tracks originating from UCSC. See the tips & tricks section for how to accomplish this.
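To get a feel for how much of the total download time is caused by the waiting period alone, the arithmetic can be sketched as below. This is only a rough illustration and not part of MotifLab: the function names are hypothetical, and the sketch assumes one request per sequence for a single track, one wait between consecutive requests, and no concurrent downloads. The actual transfer time comes on top of this.

```python
import math


def estimate_wait_seconds(num_sequences, delay_ms=3000):
    """Time spent only on waiting between requests, in seconds.

    Assumes one request per sequence and one pause between each pair of
    consecutive requests (so N sequences give N - 1 pauses).
    """
    if num_sequences <= 0:
        return 0.0
    return (num_sequences - 1) * delay_ms / 1000.0


def days_needed(num_requests, max_per_day=5000):
    """Number of days needed to stay within a per-day request limit."""
    if num_requests <= 0:
        return 0
    return math.ceil(num_requests / max_per_day)


if __name__ == "__main__":
    for delay in (0, 3000, 15000):
        wait = estimate_wait_seconds(1001, delay_ms=delay)
        print(f"1001 sequences, delay {delay} ms: {wait / 60:.1f} minutes of waiting")
    print("Days needed for 12000 requests at 5000 per day:", days_needed(12000))
```

With the default 3000 ms delay, 1001 sequences spend 3000 seconds (50 minutes) just waiting for each track downloaded, which is why lowering the delay or using a local mirror makes such a difference.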

It is also possible for MotifLab to make several concurrent download requests. Go to the "Configure" menu and select "Options..." to bring up the Options-dialog. From the "General" tab you can increase the number of "Maximum Concurrent Downloads" (which defaults to 1). Note that increasing the number of concurrent downloads or skipping the waiting period between server requests will increase the strain on servers, so these options should be used with caution. If you need fast data access for a large number of sequences, it can be wise to set up your own local mirrors for the tracks you need. (For your convenience, we have already done so for a few of the most commonly used tracks from the hg18 and mm9 builds.)
Also, make sure to have the data caching option activated (under the "Cache" tab in the Options-dialog). This will ensure that data which has already been downloaded once from external servers is stored locally and will be readily available the next time you want to access it.

In MotifLab version 2.0+ you can set up data sources that retrieve information from files on your local computer containing genome-wide sequence data in compressed binary formats such as bigBed, bigWig and 2bit. This option is by far the most efficient way to import data into MotifLab and is especially recommended if you want to analyze large datasets.


Can you add support for the motif/module discovery method "InsertNameOfYourFavoriteMethodHere"?

If the method in question can be downloaded and executed locally and it has a relatively simple command-line interface, it should be rather straightforward to use it with MotifLab. If the method reads input in standard bioinformatics formats like FASTA (which most methods do) and outputs results in standard formats like GFF, all that needs to be done is to define the input and output parameters of the program in an XML-based configuration file. If, however, the output from the program is in some non-standard format (which is often the case), we also need to write a parser for it in MotifLab. If you contact us we will be happy to assist you with this work.
Note, however, that if the program does not take all its parameters on the command line, but rather relies on a separate configuration file to provide the parameter values, it cannot currently be used with MotifLab. (Hopefully this will be remedied in the future.) See also this page.
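
To illustrate what "output in a standard format" means in practice, here is a minimal sketch of reading GFF output from an external program. It is not MotifLab's own parser and does not show MotifLab's XML configuration; the names GffFeature and parse_gff are hypothetical and used only for illustration. The sketch assumes tab-separated GFF lines with eight mandatory columns and an optional ninth attribute column, and it skips comment lines starting with "#".

```python
from typing import Iterable, Iterator, NamedTuple


class GffFeature(NamedTuple):
    """One line of GFF output (hypothetical container for illustration)."""
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: str
    strand: str
    frame: str
    attributes: str


def parse_gff(lines: Iterable[str]) -> Iterator[GffFeature]:
    """Yield one GffFeature per data line, skipping blanks and comments."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip() or line.startswith("#"):
            continue
        # Split into at most 9 fields so tabs inside the last column survive.
        fields = line.split("\t", 8)
        if len(fields) < 8:
            raise ValueError(f"Expected at least 8 tab-separated columns: {line!r}")
        attributes = fields[8] if len(fields) > 8 else ""
        yield GffFeature(
            seqname=fields[0],
            source=fields[1],
            feature=fields[2],
            start=int(fields[3]),
            end=int(fields[4]),
            score=fields[5],
            strand=fields[6],
            frame=fields[7],
            attributes=attributes,
        )


if __name__ == "__main__":
    # Made-up example output from an imaginary motif discovery program.
    example = [
        "##gff-version 2",
        "seq1\tSomeMethod\tmotif\t10\t21\t8.5\t+\t.\tmotif_name \"M1\"",
    ]
    for feature in parse_gff(example):
        print(feature.seqname, feature.start, feature.end, feature.strand)
```

A program whose output already looks like this needs no custom parser; anything that deviates from such a column layout is what would require additional parsing code.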