I find it a useful exercise to sometimes step away from what I am working on and look at it from a broader (and sometimes historical) perspective. For me as a scientist this means moving from doing actual research to a stance similar to that of philosophers of science: thinking about how and why we do science in the first place. Actually, I have a keen interest in philosophy and have also done some work together with Henrik Thorén, a philosopher, colleague and friend, on the relation between emergence and complexity, concepts that seem to be perennial in the scientific literature. (We are also planning a book that analyses the idea of 'complexity' in depth, but more on that in a future blog post.)
The question I will try to tackle in this post is how the goals of our research are connected to levels of observation and explanation in science, and I will argue for a top-down approach where understanding "higher" levels should be prior to investigating lower-level phenomena. Or, in the terms of cancer research: understand why and how cancers grow before you start sequencing their DNA.
Let me start by clarifying what I mean by research goals, and then move on to the slightly more delicate question of levels in science. The goals that scientists set up for themselves are short-term (solving this equation or understanding this subsystem), mid-term (finishing a research program) and long-term (solving the "big" problems). I would say that my (highly unrealistic) goal is to cure cancer, or at least to improve the care of cancer patients. If we ideally assume that this goal is prior to my other goals, then the structure and characteristics of the "big" problem will dictate my lesser goals. Or, in more concrete terms: Cancer kills. Why does it kill? Because it grows without bounds. Why does it grow without bounds? And how does it achieve this? Answering these questions will then define new sub-problems, and after a few rounds we are down to the questions that thousands of people are struggling with in cancer biology labs all over the world. We break down complicated questions into simpler ones, solve them and then move upwards in the hierarchy.
This account is highly idealised, and in many instances it is impossible to dissect a complicated phenomenon with precision. Instead we have to make assumptions and reason about how the system ought to function and behave. If we again look at cancer research, two major shortcuts have served this role in the past: firstly, "cell lines are a reasonable substitute for in vivo experiments", and secondly, "cancer is a disease of cell division, therefore blocking cell growth will cure cancer". The latter assumption has served as a motivation for studying signalling pathways that mediate extra-cellular growth signals. If these paths of communication are damaged by genetic or epigenetic events then cells might become malignant (and there's your motivation for sequencing and microarrays).
Let's go back to the other ingredient in this argument: levels of observation and explanation. Levels in nature are so abundant that in most cases we don't even think about them. When you look at a collection of birds you see a flock and conceptualise it as a whole, and only if you strain your mind can you identify the behaviour of single birds. Intuitively it makes sense to talk about the behaviour of the flock and describe its twists and turns. That this strategy of forming higher-level structures works well in everyday life should not come as a surprise, but how does it fare as a scientific strategy? Actually, in most cases it works well, and the reason seems to be that objects in the world interact strongly enough to form entities that are coherent in space and time. It makes sense to talk about and study genes since 'the gene' (a specific configuration of base pairs) is roughly constant over time, and its properties are largely independent of its spatial location. Genes also interact in lawful ways (e.g. dominance and epistasis), and the fact that the field of genetics was born in an age when the actual constitution of genes, the DNA, was unknown shows that many facts about genes can be understood without referring to their constituent parts, the nucleotides.
The structure of scientific inquiry reflects the 'natural' levels, since it makes much more sense to study objects that manifest themselves for free, so to speak, than to force an incongruent framework on top of the world. If we again look at cancer, we identify it as a disease of the organism (a natural level if there is one in biology); we find it in a specific organ, forming in a certain tissue of that organ and from a given cell type (e.g. epithelial cells). If we continue even further we identify certain cellular behaviours or phenotypes that are dysregulated in the malignant cells; these are controlled by regulatory pathways, whose activity is in turn controlled by the expression of genes. When we follow the causal chain of the disease we hence end up at the level of the gene, which is commonly viewed as the causal agent in biology. (This view is again highly idealised, and the status of the gene in biology is a hot topic, most commonly framed as the gene-centric view of Dawkins vs. the holistic view of Gould.) In cancer biology this chain can never be perfectly traced from patient to gene, and instead a common strategy has become to study the differences between normal and malignant cells in terms of gene expression or mutations. The job is then to trace the consequences of these differences in terms of cellular behaviour and tumour growth dynamics, and to tease out the changes that are relevant for the dynamics of the system from the ones that are neutral (cf. driver vs. passenger mutations).
This is, however, far from trivial, and often results in a situation where we have massive amounts of data on the genetic level that we cannot interpret or understand. A more sensible approach would be to start at the other end of the causal chain and figure out which tissue and cellular processes are responsible for the malignant growth, and then identify which pathways regulate and control these processes. If we analyse the dynamics of these pathways we can identify the genes that drive their up- or down-regulation, and hence identify the genes we need to target. The reason why we have to land at the genetic or molecular level is that a cure for cancer has to act on this level, in the form of radiation, cytotoxic drugs or small-molecule inhibitors. In other words, while the cure is molecular, identifying the cause of the disease requires a top-down perspective.
To me this is exactly what mathematical modelling of cancer is about: building models of systems at different levels - be it of tissues, single cells or pathways - and analysing the behaviour of the system with the aim of identifying ways in which the dynamics can be altered in a favourable direction. An example of this is the view that tissues within our bodies are best seen as ecosystems of interacting cell types, and consequently that cancer should be viewed as a failure within this multi-species context (for a recent review of this, see this paper by David Basanta and Alexander Anderson).
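To make the ecosystem picture a little more tangible, here is a minimal sketch of what a tissue-level model could look like. It is not taken from the review mentioned above; it is simply the textbook Lotka-Volterra competition model from ecology, applied to two hypothetical cell populations ("normal" and "tumour") that share a tissue with a fixed carrying capacity. The function name `simulate`, all parameter names and all parameter values are my own assumptions, chosen only for illustration.

```python
def simulate(a_nt, a_tn, r_n=0.5, r_t=1.0, K=1.0,
             n0=0.99, t0=0.01, dt=0.01, steps=20000):
    """Two-population Lotka-Volterra competition, forward Euler integration.

    n: normal cell density, t: tumour cell density (both in units of K).
    a_nt: how strongly tumour cells suppress normal cells (assumed parameter).
    a_tn: how strongly normal cells suppress tumour cells (assumed parameter).
    All default values are illustrative, not fitted to any data.
    """
    n, t = n0, t0
    for _ in range(steps):
        # Logistic growth, reduced by crowding from both cell types.
        dn = r_n * n * (1.0 - (n + a_nt * t) / K)
        dtum = r_t * t * (1.0 - (t + a_tn * n) / K)
        n += dt * dn
        t += dt * dtum
    return n, t


if __name__ == "__main__":
    # Tissue in which normal cells compete poorly against the tumour.
    print("weak tissue control:  ", simulate(a_nt=1.5, a_tn=0.5))
    # Same cells, but the tissue context now suppresses the tumour.
    print("strong tissue control:", simulate(a_nt=0.5, a_tn=1.5))
```

In this toy model the tumour population starts out identical in both runs; whether it takes over or dies out depends only on the two interaction strengths, which are properties of the tissue as a whole rather than of any single gene. That is the kind of higher-level fact I have in mind.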
So, to conclude this post: if we want to make progress in cancer research, I believe that we should get our higher-level facts right before we start looking for the causal agents at the lower level.