Research
Genetic sequence and regulatory variation
How the functional elements that control gene expression are used to create a diversity of tissues remains poorly understood. The ultimate aim of classical genetics and modern genomics is to understand, in molecular detail, how the genome is deployed transcriptionally to give rise to different tissues and species. This understanding is profoundly important for cancer research, because a major hallmark of tumour progression is the occurrence of new genetic mutations and the resulting perturbation of gene expression programs. Using liver and liver cancer as model systems, we study the regulation and evolution of all forms of transcription that occur in mammals.
The control and evolution of cellular gene expression
Transcription factors, the proteins that control gene expression, bind to DNA in a combinatorial manner in yeast and bacteria, and my early work showed that this combinatorial binding occurs in mammalian tissues as well. Master regulators in primary human hepatocytes form a highly interconnected core circuitry and frequently bind promoter regions in clusters, particularly at highly regulated and highly transcribed genes (Odom et al., Mol Syst Biol 2006; 2: 2006.0017) (Figure 1). More surprisingly, we have recently found that transcriptional regulation can vary much more rapidly and widely than previously appreciated among homologous tissues from many mammals (Schmidt et al., Science 2010; 328: 1036; Odom et al., Nat Genet 2007; 39: 730). Experiments in our laboratory identified specific genetic architectures that appear to preserve a small number of transcription factor binding events across large evolutionary timescales (>300 million years) (Schmidt et al., Science 2010; 328: 1036).

Figure 1
Core human hepatocyte regulatory circuitry. The black ovals represent transcription factors that are required for the creation and maintenance of liver-specific transcription. The red arrows represent the autoregulatory loops at the apex of the core regulatory circuitry in human liver. The blue arrows represent the regulator-to-regulator connections that exist in vivo in human liver.
In asking why most transcription factor binding events vary so rapidly, we realised that a number of factors could contribute. Possible causes include variation in genetic sequence, the types and number of marks on the histone proteins that package DNA (commonly thought of as an epigenetic code), and even dietary or environmental differences between species. To isolate a single one of these variables, we used a previously created mouse model of Down's syndrome that carries a virtually complete copy of a human chromosome (O'Doherty et al., Science 2005; 309: 2033). By exploiting this aneuploid mouse strain, a unique and powerful genetic tool designed for an entirely different purpose, my laboratory was able to determine that genetic sequence dominates all other factors in directing transcription (Wilson et al., Science 2008; 322: 434).
The origin, regulation, and evolution of noncoding RNAs
We have been using similar comparative functional genomics approaches to look at the regions of the genome that are transcribed but do not code for proteins. The transcripts from these regions are known as noncoding RNAs, and range from well-characterised species like tRNAs and rRNAs to newer categories of regulatory nucleic acids like microRNAs, piRNAs, and endogenously expressed RNAi. In addition, we are using the unbiased maps of transcriptional regulation we have generated to date to investigate how these molecules are themselves regulated (Figure 2).

Figure 2
RNA polymerase III regulation of tRNA loci in six mammals.
