Friday, 31 May 2013

Warburg's Lens: a new preprint initiative in Mathematical Oncology

Recently there has been a lot of discussion about preprints in biology, and a call for faster dissemination of unpublished research. In response to this I have joined a couple of colleagues in creating a blog that posts and discusses new preprints in mathematical oncology. Our site is inspired by Haldane's Sieve, which serves a similar purpose in the field of population genetics.

The blog was launched only a couple of days ago, but we already have our second preprint posted, which is a paper of mine that I have already mentioned on this blog.

Without further ado I give you Warburg's Lens.

Tuesday, 28 May 2013

How do we best approach the problem of cancer, bottom-up or top-down?

I find it a useful exercise to sometimes take a step away from what I am working on and look at it from a broader (and sometimes historical) perspective. For me as a scientist this means moving from doing actual research to a stance similar to that of philosophers of science: thinking about how and why we do science in the first place. Actually I have a keen interest in philosophy and have also done some work together with Henrik Thorén, a philosopher, colleague and friend, on the relation between emergence and complexity, concepts that seem to be perennial in the scientific literature. (We are also planning a book that analyses the idea of 'complexity' in depth, but more on that in a future blog post.)

The question I will try to tackle in this post is how the goals of our research are connected to levels of observation and explanation in science, and I will argue for a top-down approach where understanding "higher" levels should come before investigating lower-level phenomena. Or in the terms of cancer research: understand why and how cancers grow before you start sequencing their DNA.

Let me start by clarifying what I mean by research goals, and then move on to the slightly more delicate question of levels in science. The goals that scientists set up for themselves are short-term (solving this equation/understanding this subsystem), mid-term (finishing a research program) and long-term (solving the "big" problems). I would say that my (highly unrealistic) goal is to cure cancer, or at least to improve the care of cancer patients. If we ideally assume that this goal takes priority over my other goals then the structure and characteristics of the "big" problem will dictate my lesser goals. Or, in more concrete terms: Cancer kills. Why does it kill? Because it grows without bounds. Why does it grow without bounds, and how does it achieve this? Answering these questions will then define new sub-problems, and after a few rounds we are down to the questions that thousands of people are struggling with in cancer biology labs all over the world. We break down complicated questions into simpler ones, solve them and then move upwards in the hierarchy.

This account is highly idealised and in many instances it is impossible to dissect a complicated phenomenon with precision. Instead we have to make assumptions and reason about how the system ought to function and behave. If we again look at cancer research then two major shortcuts have in the past served this role: firstly "cell lines are a reasonable substitute for in vivo experiments" and secondly "cancer is a disease of cell division, therefore blocking cell growth will cure cancer". The latter assumption has served as a motivation for studying signalling pathways that mediate extra-cellular growth signals. If these paths of communication are damaged by genetic or epigenetic events then cells might become malignant (and there's your motivation for sequencing and microarrays).

Let's go back to the other ingredient in this argument: levels of observation and explanation. Levels in nature are so abundant that we in most cases don't even think about them. When you look at a collection of birds you see a flock, and conceptualise it as a whole, and only if you strain your mind can you identify the behaviour of single birds. Intuitively it makes sense to talk about the behaviour of the flock and describe its twists and turns. That this strategy of forming higher-level structures works well in everyday life should not come as a surprise, but how does it fare as a scientific strategy? Actually in most cases it works well, and the reason seems to be that objects in the world interact strongly enough to form entities that are coherent in space and time. It makes sense to talk about and study genes since 'the gene' (a specific configuration of base pairs) is roughly constant over time, and its properties are largely independent of its spatial location. Genes also interact in lawful ways (e.g. dominance and epistasis), and the fact that the field of genetics was born in an age when the actual constitution of genes, the DNA, was unknown shows that many facts about genes can be understood without referring to their constituent parts, the nucleotides.

The structure of scientific inquiry reflects the 'natural' levels since it makes much more sense to study objects that manifest themselves for free, so to speak, instead of forcing an incongruent framework on top of the world. If we again look at cancer we identify it as a disease of the organism (a natural level if there is one in biology), we find it in a specific organ, and we find that it forms in a certain tissue of the organ and in a given cell type (e.g. epithelial cells). If we continue even further we identify certain cellular behaviours or phenotypes that are dysregulated in the malignant cells. These are controlled by regulatory pathways, whose activity is controlled by the expression of genes. When we follow the causal chain of the disease we hence end up at the level of the gene, which is commonly viewed as the causal agent in biology. (This view is again highly idealised and the status of the gene in biology is a hot topic, most commonly framed as the gene-centric view of Dawkins vs. the holistic view of Gould.) In cancer biology this chain can never be perfectly traced from patient to gene, and instead a common strategy has become to study the differences between normal and malignant cells in terms of gene expression or mutations. The job is then to trace the consequences of these differences in terms of cellular behaviour and tumour growth dynamics, to tease out the changes that are relevant for the dynamics of the system from the ones that are neutral (cf. driver vs. passenger mutations).

This is however far from trivial, and often results in a situation where we have massive amounts of data on the genetic level that we cannot interpret or understand. A more sensible approach would be to start at the other end of the causal chain and figure out which tissue and cellular processes are responsible for the malignant growth, and then identify which pathways regulate and control these processes. If we analyse the dynamics of these pathways we can identify the genes that drive their up- or down-regulation, and hence identify the genes we need to target. The reason why we have to land at the genetic or molecular level is that a cure for cancer has to act on this level in the form of radiation, cytotoxic drugs or small-molecule inhibitors. In other words, while the cure is molecular, identifying the cause of the disease requires a top-down perspective.

To me this is exactly what mathematical modelling of cancer is about. Building models of systems at different levels - be it of tissues, single cells or pathways - and analysing the behaviour of the system with the aim of identifying ways in which the dynamics can be altered in a favourable direction. An example of this is the view that tissues within our bodies are best seen as ecosystems of interacting cell types, and consequently that cancer should be viewed as a failure within this multi-species context (for a recent review of this look at this paper by +David Basanta and +Alexander Anderson).

So to conclude this post: if we want to make progress in cancer research I believe that we should get our higher-level facts right before we start looking for the causal agents at the lower level.

Thursday, 23 May 2013

Travelling wave analysis of a mathematical model of glioblastoma growth

Spurred by recent discussions with +Jacob Scott about preprints in biology, and fed up with the slow review process of some journals, I've decided to upload my most recent paper on brain tumour modelling to arXiv (and to continue to do so with future papers).

This paper is quite technical (at least by my standards) and contains the mathematical analysis of a model of glioblastoma growth that was published last year in PLoS Computational Biology. In this model the cancer cells switch between a proliferative and a migratory phenotype, and it was previously shown that the dynamics of the cell-based model can be captured by two coupled partial differential equations, which (like the Fisher equation) exhibit travelling wave solutions. In this paper I have analysed this PDE system and shown the following things:

1. With a couple of assumptions on model parameters one can obtain an analytical estimate of the wave speed.
2. In the limit of large and equal switching rates the wave speed equals that of the Fisher equation (which is what you'd expect).
3. Using perturbation techniques one can obtain an approximate solution to the shape of the expanding tumour.
4. In the Fisher equation the wave speed and the slope of the front are in one-to-one correspondence (faster wave <--> less steep front). This property does not hold for our system.
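
As a point of reference for items 2 and 4, the Fisher equation itself is easy to explore numerically. The following is a minimal sketch, not the analysis from the paper: it integrates the one-dimensional Fisher equation u_t = D*u_xx + r*u*(1 - u) with a simple explicit finite-difference scheme and measures how fast the front moves, which can then be compared with the classical minimal wave speed 2*sqrt(r*D). The function name, the grid, the time step, the initial condition and the parameter values are all illustrative assumptions, and the two-equation glioblastoma model is not implemented here.

```python
import numpy as np


def fisher_front_speed(D=1.0, r=1.0, length=200.0, dx=0.5, dt=0.01, t_end=60.0):
    """Integrate u_t = D*u_xx + r*u*(1 - u) and return the measured front speed.

    All default values are illustrative assumptions, not values from the paper.
    """
    x = np.arange(0.0, length + dx, dx)
    # Assumed initial condition: a small saturated region at the left edge.
    u = np.where(x < 5.0, 1.0, 0.0)
    times, fronts = [], []
    n_steps = int(round(t_end / dt))
    for step in range(1, n_steps + 1):
        lap = np.empty_like(u)
        lap[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
        lap[0] = 2.0 * (u[1] - u[0]) / dx**2      # no-flux boundary (mirror point)
        lap[-1] = 2.0 * (u[-2] - u[-1]) / dx**2   # no-flux boundary (mirror point)
        u = u + dt * (D * lap + r * u * (1.0 - u))  # explicit Euler step
        t = step * dt
        if t >= t_end / 2.0:  # skip the initial transient
            i = int(np.argmax(u < 0.5))  # first grid point where u drops below 0.5
            # Linear interpolation between grid points i-1 and i.
            front = x[i - 1] + dx * (u[i - 1] - 0.5) / (u[i - 1] - u[i])
            times.append(t)
            fronts.append(front)
    # The slope of front position against time is the wave speed.
    return float(np.polyfit(times, fronts, 1)[0])


if __name__ == "__main__":
    speed = fisher_front_speed()
    print(f"measured front speed: {speed:.3f}")
    print(f"theoretical minimal speed 2*sqrt(r*D): {2.0 * np.sqrt(1.0 * 1.0):.3f}")
```

With the default values the measured speed should be close to the theoretical one. Exact agreement is not expected, since the front approaches its asymptotic speed slowly and the discretisation introduces its own error.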

Here's a link to the submission at arXiv.

Wednesday, 22 May 2013

Mathematical biology or Bioinformatics

Conversation overheard in an inter-disciplinary research centre of unknown location.

Molecular Biologist: So, what kind of research do you do?
Mathematical Biologist: I do modelling, mathematical modelling of cancer.
Molecular Biologist: I see, interesting. So you mean bioinformatics?
Mathematical Biologist (trying to be polite): Well, not quite. My work is more about building mechanistic models that help our understanding of different steps of tumour progression.
Molecular Biologist: I never quite understood all those statistical methods and hypothesis testing, but I'm glad someone likes it!
Mathematical Biologist (slowly losing patience): Well, actually... I'm not very good with statistics either, my work is more about understanding the mechanisms at work in cancer, using mathematics.
Molecular Biologist: Oh, I think I understand now. By the way, I have some microarray-data that maybe you can have a look at.
Mathematical Biologist (squeezing through the opening elevator doors): Ok... drop me an email.

This dialogue is fictional but draws inspiration from the many encounters and discussions I've had about my research with biologists. Usually the conversations last a bit longer than the above, and end in some sort of understanding of what my work is really about.

It's not that I'm easily offended when people think that I'm a bioinformatician, but mathematical/theoretical biology and bioinformatics are fundamentally different lines of research, with different methods and goals, and I'll try to explain why I think that is the case.

In order to illustrate my point we need to take a step back from biology and look at science from a broader perspective. The process of doing science and producing new knowledge about the world is usually termed the scientific method and can roughly be divided into: Hypotheses, Experiments, Results and Conclusions/Findings (I'm sure many philosophers of science will disagree with this, but this basic subdivision will do for my argument). The process is circular in that we start with some idea about how a certain system or phenomenon is structured (i.e. a hypothesis), we then transform that hypothesis into a statement that is experimentally testable, carry out the experiment, and from the data determine if the hypothesis was true or false. This fact is added to our knowledge of the world and from our extended body of knowledge we produce new hypotheses.

In order to structure our knowledge about a phenomenon we construct theories that in a more or less formal manner codify our knowledge within a coherent framework. Mathematics is such a framework that was applied successfully first in physics, and then later in chemistry and most other natural sciences. In the language of mathematics we can transform statements in a rigorous, truth-preserving manner, moving from things that are certainly true (based on observation) to things that are possibly true (to be decided by experiment).

It is in this part of the scientific method that mathematical biology fits in. In a mathematical model we incorporate known facts, and maybe add some hypothetical ones, analyse the model and produce hypotheses that hopefully are testable in experiments (disclaimer: this is highly idealised. A lot of mathematical biology is far removed from experiments and more concerned with mathematical analysis, but deciding where to draw the line between mathematical biology and applied analysis is, at least to me, a pointless exercise). Another equally important task for mathematical biologists is to form new theoretical constructs, and define new properties that are of relevance. An example of this is R0, the 'basic reproduction number' of a pathogen, which quantifies the number of cases one case generates on average over the course of its infectious period in a fully susceptible population. The concept goes back to Ronald Ross's mathematical modelling of malaria; Ross was awarded the Nobel Prize in Physiology or Medicine in 1902 for his work on malaria transmission.
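
To make the role of R0 concrete, here is a minimal sketch using the textbook SIR model. This is an assumption made for illustration only: it is not Ross's malaria model, which also involves the mosquito population. In the SIR model the transmission rate `beta` and the recovery rate `gamma` (both hypothetical parameter names) give R0 = beta/gamma, and an outbreak can only take off when R0 is larger than one.

```python
def basic_reproduction_number(beta, gamma):
    """R0 for the simple SIR model: transmission rate times mean infectious period."""
    return beta / gamma


def simulate_sir(beta, gamma, i0=1e-3, dt=0.01, t_end=400.0):
    """Euler integration of the SIR model in population fractions.

    Returns (peak infected fraction, final recovered fraction).
    The initial infected fraction i0 and the step size are illustrative choices.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    peak = i
    for _ in range(int(round(t_end / dt))):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        peak = max(peak, i)
    return peak, r


if __name__ == "__main__":
    for beta in (0.2, 0.5):  # hypothetical transmission rates
        gamma = 0.25         # hypothetical recovery rate
        peak, final = simulate_sir(beta, gamma)
        print(f"R0 = {basic_reproduction_number(beta, gamma):.2f}: "
              f"peak infected = {peak:.4f}, final recovered = {final:.4f}")
```

The point of the sketch is that a single derived quantity, rather than the individual parameters, decides the qualitative behaviour of the system, which is the kind of theoretical construct modelling can contribute.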

If the role of mathematical biology is to define new concepts and generate hypotheses, where does bioinformatics fit into the process of scientific discovery? The role of bioinformatics is to structure and make sense of the other side of the scientific method: to design experiments and aid us in interpreting the outcomes. In molecular biology the days of simple experiments, when measuring a single quantity was enough to prove or disprove a hypothesis, are almost gone. With today's measurement techniques such as microarrays, methylation probes or SNP analysis, one is presented with quantities of data that are far beyond the reach of the human intellect. In order to decode the data and draw conclusions we need algorithms developed by bioinformaticians. Apart from this they are also involved in the step between hypotheses and experiments, designing the most efficient and accurate ways of carrying out experiments (e.g. determining how much coverage we get with a given sequencing technique).
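
As a small illustration of the coverage question mentioned above, here is a hedged sketch based on the standard back-of-the-envelope estimate (in the spirit of the Lander-Waterman model): mean coverage is the number of reads times the read length divided by the genome size, and if reads are assumed to land uniformly at random, the fraction of the genome covered at least once is roughly 1 - exp(-coverage). The function names and the example numbers are hypothetical, and real sequencing data deviates from the uniformity assumption.

```python
import math


def mean_coverage(n_reads, read_length, genome_size):
    """Average number of reads covering a base: C = N * L / G."""
    return n_reads * read_length / genome_size


def fraction_covered(coverage):
    """Expected fraction of bases covered at least once, assuming that
    reads start at uniformly random positions (Poisson approximation)."""
    return 1.0 - math.exp(-coverage)


def reads_needed(target_coverage, read_length, genome_size):
    """Number of reads required to reach a target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    # Hypothetical experiment: one million 100 bp reads on a 10 Mb genome.
    c = mean_coverage(1_000_000, 100, 10_000_000)
    print(f"mean coverage: {c:.1f}x")
    print(f"expected fraction of genome covered: {fraction_covered(c):.5f}")
    print(f"reads for 30x on a 3 Gb genome: {reads_needed(30, 100, 3_000_000_000)}")
```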

In my view mathematical biology and bioinformatics serve as two independent and non-overlapping disciplines that both aid the actual biologists (i.e. the experimentalists) in making the scientific method spin.

The inspiration for this post (and the figure) came from +Jacob Scott who came up with the idea when writing a recent review on mathematical modelling of metastases. Thanks!

Monday, 13 May 2013

Technology is in the driver seat and we're heading to petaland

There is little doubt that technology has had a large impact on the course of scientific discovery in the past. For example it was Tycho Brahe's development of more precise measurement techniques for astronomical observation that paved the way for Johannes Kepler's identification of regularity in the motion of the planets (i.e. Kepler's three laws of planetary motion), which eventually led to Newton's formulation of the laws of celestial (and terrestrial) mechanics. While Brahe made his observations with the naked eye, Galileo Galilei's use of the newly invented telescope for astronomy, and his subsequent observation of the moons of Jupiter, were important events that, together with other empirical evidence, eventually toppled the Aristotelian world-view. During the same period there was an improvement in observing not only the very distant, but also the very small. The first microscope appeared at the turn of the 17th century, and within 50 years the technology had improved to such an extent that single cells were visible to the human eye. In fact the word 'cell' was coined by Robert Hooke in his book Micrographia (1665). It would take another 200 years of observation before the dynamics of cell division were observed, and another 50 years until it was understood that the chromosomes, which were being shuffled between the two daughter cells, were the carriers of hereditary information.

Since the days of the scientific revolution in the 17th century technology has advanced enormously and every aspect of human life is influenced by technological artefacts. Most of us don't have the faintest idea of how they are constructed or even how they operate, but this is not really an issue since someone knows how they are built and how to fix them if they fail. More disconcerting is the fact that we are often told that only the latest gadgets are worth owning, and that each epsilon change to a piece of technology will revolutionise its use. The need for the latest gadget might be a fundamentally human trait, and is mostly of economic and political interest, but what happens when this need for new technology enters into the scientific method and our ways of doing science?

The last 30 years of biological research have been heavily influenced by advances in technology, which have led to a massive increase in knowledge about the living world. DNA sequencing, RNA-expression measurements (e.g. microarrays) and DNA methylation measurements, just to mention a few, have allowed biologists to address questions that for a long time remained unanswered. But technology doesn't just answer existing questions, it also poses new ones. Microarray measurements made it possible to map out the RNA expression levels of all the genes in the genome at once. Since all processes that occur within a human cell are affected by the RNA expression of one gene or another, it soon became standard practice within all branches of molecular biology to perform microarray measurements, and it was basically required if you wanted to publish your findings in a respected journal. The data that emerged was high-dimensional and complex, and to this day we don't have a precise understanding of how RNA expression relates to gene regulation and protein expression. Completely ignoring this lag between measurement on the one hand and theory and concept formation on the other, the biotech industry has continued to develop techniques with even higher coverage and larger depth. The scientific community has been drawn into this technological whirlwind and today, when we still don't have a full understanding of microarray data, using it is basically frowned upon, and we are asked why we didn't make use of RNA-seq or 'next-generation sequencing' in our study.

New technology lets us probe further and deeper into living matter, and doubtless this has helped our understanding of these systems. However, considering how little we have learned about human biology from the Human Genome Project (sequencing of the entire human genome), it's tempting to speculate about where we would have been today if all the effort and money that went into sequencing had instead been spent on dedicated experiments and hypothesis-driven inquiry.

Today we are producing terabytes upon terabytes of data from biological systems, but at the same time we seem to know less and less what that data actually means. I think it's time to focus on the scientific problems at hand, and try to understand and build theories of the data produced by current technology, before we rush into the next next-generation piece of technology, that in the end will just make us forget what we initially were asking. If not, it won't be long until we count our datasets in petabytes.

Monday, 6 May 2013

The principles of metastasis

I have, since the beginning of my scientific career, been under the somewhat naive impression that science moves relentlessly forward, and that we (as scientists) slowly accumulate more and more knowledge about the world. Things are discovered, communicated and never forgotten within the realms of science. Of course this cannot be true in every single case, but reading this book by Leonard Weiss put me into first-hand contact with the loss of scientific knowledge, which struck me particularly hard since it is about a topic I myself am concerned with. This book was published for everyone to read in 1985, so what I am talking about is not an outright denial of scientific fact, but rather a collective amnesia or erosion of knowledge.

Let me be a bit more specific, and mention two things that caught me by surprise when I read this book. Firstly, it was established in the 1940s that cancer cells from primary tumours in the breast and prostate can travel to the vertebrae and pelvis without first passing through the capillary bed of the lung. This passage is mediated by a structure known as Batson's venous plexus, which allows for a reversal of flow in the veins, in particular during coughing or sneezing. This, at least in part, explains the predilection of breast and prostate tumours to metastasise to the bone, yet these organ pairs are today often mentioned as prime examples of the seed-soil hypothesis, i.e. the idea that cancer cells thrive in certain organs with favourable 'soil'. This is often talked about in terms of micro-environmental compatibility, which compared to physical blood flow is quite a far-fetched and complex explanation.

My second moment of surprise came when Weiss discussed the literature on cancer cell clump size distribution, meaning the size of the cancer cell clusters that enter the circulation and eventually arrest in capillary beds. I'm far from an expert in the field of metastasis but the view that I have acquired from reading review articles and papers on the topic is that cancer cells travel on their own, never collectively. This mistake on my part (or on the part of the authors I've previously read) is made even more serious since it turns out that the success of cancer cells when they arrest in foreign organs depends very much on the clump size: a larger clump of cancer cells means a higher chance of forming a metastasis.

I sincerely hope that the above-mentioned 'surprises' are isolated occurrences, but the impression I get from reading this book is that many fundamental insights into the dynamics of the metastatic process have been lost since the 1980s. When did you last hear about the transit times and arrest patterns of circulating cancer cells in different organs, intravascular growth prior to extravasation, the rate at which metastatic cells appear in the primary tumour and the number of CTCs that pass through a capillary bed?

In my view these questions lost their appeal when we entered the 'gene-centric' era with its expectation that the answer to every question in biology lies in the genome, and in deeper and yet deeper sequencing. (On a side note I think the gene-centricity in turn is driven by the love of new technology which implies that any data that is acquired with the previous technology is worthless and does not need any explanation.) When focus moved to the genome a lot of the knowledge about the physical aspects of metastasis was ignored, and when those facts had been forgotten they were re-examined in genetic terms. That said, I'm happy to witness how genetic and proteomic techniques have advanced since the publication of this book, and I can honestly say that the chapters that deal with the biochemistry of cancer have not aged well. Today we know a lot more about the biochemistry of the metastatic cascade, but on the other hand possibly less about its physical aspects.

Apart from providing a new (or rather old) perspective on the process of metastasis the book also contains something that I find lacking in the current scientific literature: the critical review. Dr. Weiss takes the time to dissect the experimental setup of the studies he reviews and identifies loopholes and mistakes in the deductions, and hence the conclusions, of many of the referenced papers. The typical review article of today tells a pleasant story (via numerous studies) of an important topic and sets the agenda for future research. I think this mainly happens since the author often is an authority on the topic, and hence has much to lose by criticising the techniques or methods used within the field. It is much more tempting then to show off your research field as a successful endeavour where experiments are successfully carried out and theory is steadily improving. What I would like to see is a much more critical stance when writing review articles: most experiments are good, but some are bad, and just repeating the conclusions drawn by the authors does not in my opinion advance science.

These are some of the thoughts that passed through my mind when reading Leonard Weiss' 'The principles of metastasis', an excellent companion for those who are interested in the dynamics of metastatic spread. I'll leave you with a figure from the book that I think summarises the author's view of metastases.