- Is MotifLab still being maintained and updated?
-
Technically, yes (as of July 2024), but we don't have explicit funding
for this work, so progress is very slow. However, we hope to
officially release version 2 of MotifLab sometime during 2024.
- Downloading feature data takes a long time. Is there any way to
speed up this process?
-
MotifLab downloads data for feature tracks one sequence at a time,
which means that downloading data for many sequences can potentially
be quite time-consuming. Also,
most of the preconfigured feature data tracks available in MotifLab are downloaded from
the UCSC Genome Browser, which was
designed for interactive use and does not handle heavy program-driven
access well. In fact, the UCSC web site suggests that automatic download of data should be limited to one download per
15 seconds and a maximum of 5000 downloads per day. Other sites have
similar restrictions, which is why MotifLab can be configured to wait
a short period of time between requests
to avoid overloading the servers. To change the setting of this
waiting period, go to the "Configure" menu, select "Configure
Datatracks..." and press the "Configure Server Settings" button.
The waiting period can be specified individually for each server in the "Delay (ms)" column.
The default setting for tracks from the UCSC Genome Browser is to wait
3 seconds (3000 ms) between each sequence, but this can be changed to
a smaller number (even 0 ms) to speed up the downloading process.
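As a rough illustration of how the delay setting affects total download time, the sketch below estimates the time spent on one track for a given number of sequences. It assumes one request per sequence and a fixed delay between consecutive requests. The function name and the `seconds_per_request` parameter are hypothetical and not part of MotifLab.

```python
def estimated_download_seconds(num_sequences, delay_ms, seconds_per_request=0.0):
    """Rough estimate of sequential download time for one feature track.

    Assumption: one request per sequence, with a fixed delay inserted
    between consecutive requests. Network time is ignored unless an
    average seconds_per_request is supplied.
    """
    if num_sequences <= 0:
        return 0.0
    # The delay applies between requests, so there is one wait fewer
    # than there are sequences.
    waits = num_sequences - 1
    return waits * (delay_ms / 1000.0) + num_sequences * seconds_per_request


# Compare no delay, the 3000 ms default for UCSC tracks, and the
# 15 second interval suggested by the UCSC web site.
for delay_ms in (0, 3000, 15000):
    total = estimated_download_seconds(1000, delay_ms)
    print(f"{delay_ms} ms delay: about {total / 60:.0f} minutes of waiting")
```

For 1000 sequences, the default 3000 ms delay alone adds 2997 seconds of waiting, which is why lowering it has such a visible effect and why it also increases the load on the server.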
If you have your own local mirror of the UCSC Genome Browser, you can
specify that this should preferentially be used instead of the normal
server for all tracks originating from UCSC. See
the tips & tricks section for how to
accomplish this.
MotifLab can also make several concurrent
download requests. Go to the "Configure" menu and select
"Options..." to bring up the Options-dialog. From the "General" tab you can
increase the number of "Maximum Concurrent Downloads" (which defaults
to 1). Note that increasing the number of concurrent downloads or
skipping the waiting period between server requests will increase the
strain on servers, so these options should be used with caution.
If you need fast data access for a large number of sequences,
it can be wise to set up your own local mirrors for the tracks you need.
(For your convenience, we have already done so for a few of the most
commonly used tracks from the hg18 and mm9 builds.)
Also, make sure to have the data caching option activated (under the
"Cache" tab in the Options-dialog). This will ensure that data which
has already been downloaded once from external servers is stored
locally and will be readily available the next time you want to access
it.
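The idea behind the cache can be sketched as follows. This is only a minimal illustration of caching downloaded data on disk, not MotifLab's actual implementation. The class `TrackCache`, the cache key (track name plus region) and the `fake_fetch` function are all hypothetical.

```python
import hashlib
import os
import tempfile


class TrackCache:
    """Stores downloaded track data on disk, keyed by track name and region."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, track, region):
        # Hash the key so that any track or region name gives a safe file name.
        key = hashlib.sha1(f"{track}|{region}".encode("utf-8")).hexdigest()
        return os.path.join(self.directory, key + ".dat")

    def get(self, track, region, fetch):
        """Return cached bytes if present, otherwise call fetch and store the result."""
        path = self._path(track, region)
        if os.path.exists(path):
            with open(path, "rb") as handle:
                return handle.read()
        data = fetch(track, region)
        with open(path, "wb") as handle:
            handle.write(data)
        return data


calls = []


def fake_fetch(track, region):
    # Stand-in for a slow download from an external server.
    calls.append((track, region))
    return b"example data"


cache = TrackCache(tempfile.mkdtemp())
first = cache.get("Conservation", "chr1:1000-2000", fake_fetch)
second = cache.get("Conservation", "chr1:1000-2000", fake_fetch)
print(len(calls))  # prints 1: the second request is served from disk
```

The second request for the same track and region never reaches the server, which is the benefit you get from leaving the caching option switched on.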
In MotifLab version 2.0+ you can set up data sources that can retrieve
information from files on your local computer containing genome-wide
sequence data in compressed binary formats such as bigBed, bigWig and 2bit.
This option is by far the most efficient way to import data into
MotifLab and is especially recommended if you want to analyze large datasets.
- Can you add support for the motif/module discovery
method "InsertNameOfYourFavoriteMethodHere" ?
-
If the method in question can be downloaded and executed locally
and it has a relatively simple command-line interface, it should be
rather straightforward to use it with MotifLab. If the method reads
input from standard bioinformatics formats, like FASTA (which most
methods do), and outputs results to standard formats (like GFF), all
that needs to be done is to define the input and output parameters of
the program in an XML-based configuration file. If, however, the
output from the program is in some non-standard format (which is often
the case), we also need to write a parser for it in MotifLab. If you
contact
us we will be happy to assist you with this work.
Note, however, that if the program does not specify all its parameters on
the command line, but rather relies on a separate configuration file to
provide the parameter values, it cannot currently be used with
MotifLab. (Hopefully this will be remedied in the future.)
See also
this page.
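To give an impression of what parsing a standard output format involves, here is a minimal sketch that reads one line of tab-separated GFF output. It is not MotifLab's own parser code. The names `GffRecord` and `parse_gff_line` are hypothetical, and the example line (including the "MEME" source label) is made up for illustration.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: Optional[float]  # None when the score column is "."
    strand: str
    frame: str
    attributes: str


def parse_gff_line(line):
    """Parse one GFF line; return None for blank lines and comments."""
    line = line.rstrip("\r\n")
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 tab-separated fields, got {len(fields)}")
    score = None if fields[5] == "." else float(fields[5])
    # The ninth (attributes) column is treated as optional here.
    attributes = fields[8] if len(fields) > 8 else ""
    return GffRecord(fields[0], fields[1], fields[2], int(fields[3]),
                     int(fields[4]), score, fields[6], fields[7], attributes)


record = parse_gff_line("chr1\tMEME\tmotif\t100\t112\t8.5\t+\t.\tname=M1")
print(record.seqname, record.start, record.end, record.score)
```

A program with a non-standard output format needs an equivalent routine tailored to that format, which is the part we can help you with.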