
Cancer Research UK Cambridge Institute



Visit the Markowetz lab website.

Computational biology

We develop computational methods to link genomic profiles with quantitative measures of phenotypes, leading towards a comprehensive systems genetics understanding of cancer.

Cancer is a disease of the genome and of cellular interactions in the tumour tissue. Integrative approaches to dissect this complexity can improve on the limited snapshots provided by individual experimental techniques. This is why my lab develops computational methods for the systems genetics of cancer. Systems genetics uses genomic techniques and integrates them with quantitative measures of phenotypes. Ideally, systems genetics brings together three dimensions: it combines (i) genome-wide analysis with (ii) many quantitative phenotypes, both at the molecular and organismal level, (iii) in many different conditions or environments (Figure 1).

Figure 1. Systems genetics comprehensively combines genome-wide analysis with many quantitative phenotypes, both at the molecular and organismal level, in many different conditions or environments. Systems genetics subsumes previous approaches that focussed on linking individual loci to a single phenotype (in classical genetic and epistasis analysis) or on linking many genomic loci to a single phenotype in a single condition (in GWAS or eQTL studies).

The ideal of comprehensive integration of genome × many phenotypes × multiple conditions is very hard to achieve, but it serves as a guiding principle for the research in my lab. Computational methods of data integration and network biology, like the ones we develop, are integral parts of systems genetics analysis.

Zooming in all the way from the global picture to the details

Our work combines two complementary directions:
First, in large data collections we analyse global portraits of cancer that combine tissue organisation and molecular profiles to infer cancer subtypes and predictive signatures. In this research area we have (1) dissected molecular subtypes of cancer, (2) quantified the cellular heterogeneity of tumour tissue to complement genomic profiling, and (3) related intra-patient genomic heterogeneity to survival.

Second, focussing on key mechanisms, we model their components, interactions and dynamics to understand how they are deregulated in cancer and can be targeted by drugs. In this area we have worked on systems genetics methods to put the action of individual genes and proteins into a cellular context. We have contributed to (1) understanding the mechanisms underlying GWAS hits in breast cancer and (2) the epigenetic regulation of differentiation programs, as well as (3) developing methods to infer pathway structure and its dynamic change from gene perturbation experiments.

Evolutionary trajectories of ovarian cancer

Biological interest in intra-tumour heterogeneity in solid tumours has increased dramatically over the past five years, because heterogeneity is a potential explanation for the development of relapsed disease. However, most studies of tumour heterogeneity so far have examined a single sample taken at a single diagnostic time point, and therefore underestimate the genetic complexity of tumours. No general empirical evidence has yet demonstrated a direct connection between heterogeneity and disease progression.

Quantifying tumour heterogeneity and understanding its aetiology depend crucially on our ability to accurately reconstruct the evolutionary history of cancer cells within each patient. However, methods for objectively quantifying tumour heterogeneity have been missing. They are particularly difficult to establish in ovarian cancer, where predominant copy-number variation and horizontal dependencies caused by long and cascading genomic rearrangements prevent accurate phylogenetic reconstruction (Figure 2).

Figure 2. Rigorous analysis of multiple cancer samples allows accurate quantification of tumour heterogeneity. Using multiple copy-number profiles from spatially and temporally distinct sites in the same patient we compute a minimum event distance to reconstruct the life history of the tumour and to quantify intra-tumour heterogeneity.

To address this challenge, Roland Schwarz, a postdoc in the group, developed MEDICC, a phylogenetic modelling approach for copy-number profiles that is based on finite-state transducers and yields more accurate results than competing methods. In collaboration with James Brenton’s lab we applied MEDICC to 170 copy-number profiles from patients undergoing neo-adjuvant chemotherapy for high-grade serous ovarian cancer (HGSOC). We found that tumour heterogeneity in HGSOC is driven by ongoing clonal evolution, with fully branched evolutionary trajectories that do not have clock-like evolutionary rates. In two patients we showed that clonal expansion of a minor subclone that was present prior to chemotherapy led to clinical relapse. Our main result is that the quantitative measures of clonal expansion and temporal heterogeneity we defined were the strongest predictors of progression-free survival, compared to clinical covariates such as grade and age. This unique dataset, together with detailed evolutionary analyses, thus allowed us for the first time to quantify the relationship between tumour heterogeneity and chemotherapy treatment. These data provide insights into mechanisms of resistance in HGSOC and show how quantifying heterogeneity could act as a prognostic indicator.
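The idea of a minimum event distance between copy-number profiles (Figure 2) can be illustrated with a deliberately simplified sketch. It is not the MEDICC algorithm: it assumes unphased total copy numbers, treats every event as a gain or loss of one copy across a contiguous run of segments, and ignores the biological constraint that a segment lost completely cannot be regained. The function names and the example profiles are hypothetical.

```python
def min_event_distance(profile_a, profile_b):
    """Minimum number of contiguous +1/-1 events turning profile_a into profile_b.

    Simplifying assumption: any run of adjacent segments can gain or lose
    one copy per event, with no restriction on copy numbers reaching zero.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same segments")
    # Per-segment change that the events have to explain.
    diff = [b - a for a, b in zip(profile_a, profile_b)]
    # Pad with zeros so that events starting or ending at the chromosome
    # boundaries are counted like any other breakpoint.
    padded = [0] + diff + [0]
    # Each event raises the step at one breakpoint by one and lowers the step
    # at another by one, so the sum of upward steps is the minimum event count.
    return sum(max(0, padded[i] - padded[i - 1]) for i in range(1, len(padded)))


def pairwise_distances(profiles):
    """Distance between every pair of named samples from one patient."""
    names = sorted(profiles)
    return {(p, q): min_event_distance(profiles[p], profiles[q])
            for p in names for q in names}


# Hypothetical profiles over four genomic segments.
samples = {
    "diagnosis": [2, 2, 2, 2],
    "relapse": [2, 3, 3, 1],
}
distances = pairwise_distances(samples)
```

In this toy example the relapse profile is explained by one gain spanning the second and third segments and one loss of the fourth, so the distance is two events. A matrix of such pairwise distances is the kind of input from which a tree of samples could be built.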

Inferring pathway rewiring from downstream effects of perturbations

In a methodological project Xin Wang, a PhD student in the group who is now a postdoc at Harvard Medical School, combined hidden Markov models (HMMs), a well-established methodology for dynamic data, with Nested Effects Models (NEMs), a methodology we have pioneered. The resulting hidden Markov Nested Effects Models (HM-NEMs) reconstruct rewiring events in pathway topologies from time-series data collected after silencing pathway components. Inferring time-varying networks is important for understanding how interactions develop and evolve over time. However, the vast majority of currently used models assume direct measurements of node states. These are often difficult to obtain, especially in fields like cell biology, where perturbation experiments often provide only indirect information about network structure. The method we propose models the evolving network by a Markov chain on a state space of signalling networks, which are derived as NEMs from indirect perturbation data. To infer the hidden network evolution and unknown parameter, we developed a Gibbs sampler, in which sampling network structure is facilitated by a novel structural Metropolis-Hastings algorithm. We show the applicability of HM-NEMs in two real biological case studies: one captures dynamic crosstalk during the progression of neutrophil polarization, and the other infers an evolving network underlying early differentiation of mouse embryonic stem cells.
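To make "a Markov chain on a state space of signalling networks" concrete, here is a small sketch that enumerates all directed networks on three nodes and simulates a trajectory through them. The transition rule is an assumption chosen for illustration, not the one used in HM-NEMs: the probability of moving to a network decays exponentially with the number of edges that differ from the current one, controlled by a hypothetical parameter `lam`. The sketch covers only the hidden-state chain and leaves out the NEM likelihood and the Gibbs sampler.

```python
import itertools
import math
import random


def all_networks(n_nodes):
    """Enumerate every directed network (as a set of edges) on n_nodes nodes."""
    edges = [(i, j) for i in range(n_nodes) for j in range(n_nodes) if i != j]
    return [frozenset(e for e, keep in zip(edges, mask) if keep)
            for mask in itertools.product([False, True], repeat=len(edges))]


def transition_probs(current, states, lam):
    """Assumed kernel: weight exp(-lam * number of differing edges), normalised."""
    weights = [math.exp(-lam * len(current ^ s)) for s in states]
    total = sum(weights)
    return [w / total for w in weights]


def simulate(states, start, n_steps, lam, rng):
    """Sample a trajectory of networks from the Markov chain."""
    path = [start]
    for _ in range(n_steps):
        probs = transition_probs(path[-1], states, lam)
        path.append(rng.choices(states, weights=probs, k=1)[0])
    return path


states = all_networks(3)                   # 2**6 = 64 candidate networks
start = frozenset({(0, 1), (1, 2)})        # hypothetical pathway 0 -> 1 -> 2
path = simulate(states, start, n_steps=5, lam=2.0, rng=random.Random(0))
rewiring_per_step = [len(a ^ b) for a, b in zip(path, path[1:])]
```

A larger `lam` makes the chain favour staying put or changing a single edge, which matches the intuition that pathways are rewired gradually. In a full model the trajectory would be hidden and inferred from perturbation data rather than simulated.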

Future plans

We started as a completely computational ‘dry lab’ and are in the process of developing into a mixed lab. Using a combination of computation, theory and experiments, we will advance the two research themes successfully established in the group. In the first theme, ‘global portraits of cancer’, we will analyse quantitative tissue phenotypes as an intermediate between molecular profiles and outcome, and develop methods to infer tumour evolution and link it to clinical variables. In our second theme, ‘components and dynamics of key mechanisms’, we will work on methodological projects in network biology and, in an integrated experimental/computational cycle, derive a quantitative model of oestrogen receptor binding dynamics.