Understanding transcriptional regulation is critical for elucidating complex
biological processes and human diseases. Transcriptional regulation is
largely determined by the binding of transcription factors (TFs) to TF binding
motifs (TFBMs). TFs often act synergistically to form complexes and thus TFBMs
often appear in modules. Although many motif-finding methods have been
developed, discovering TFBMs and motif modules remains a challenge,
particularly on a genome-wide scale.
We approach the problem of discovering TFBMs from a steganographic perspective
in which secret messages (motifs) are embedded in a stegoscript
(genome). I will first describe an efficient, genome-wide motif finding
algorithm, called WordSpy. I will then consider the problem of motif-module
discovery, discuss our WordModuler motif-module discovery algorithm, and
present some results of cis-element modules for yeast cell-cycle regulation.
In the second part of my talk, I will discuss two applications of our motif
identification and analysis methods. The first is to understand the regulation
of modules of co-expressed genes differentially expressed in the brains of
patients with Alzheimer's disease. The result indicates that many genes that are
co-expressed in diabetes, cardiovascular diseases and Alzheimer's disease
share the same regulatory mechanism. The second application is to identify
stress-responsive microRNAs in plants through a cis-element-based
transcriptome analysis. Using this method, we predict 19 microRNAs in 11
microRNA families to be inducible by cold stress in the model plant
Arabidopsis. Experimental validation shows that among the eleven microRNAs,
eight are differentially induced and three are constantly expressed under low
temperature. Our result expands the number of cold-inducible microRNAs from
four to eight.