A Winding Path from Complex Analysis to Computational Biology

by Robert Thurman, Principal Computational Biologist,  

Seattle Genetics,

Bothell, WA

Computational biologists come in two types: those who were originally trained mathematically or computationally and then gravitated towards biological problems, and those who were formally trained on the biological side but couldn’t stay away from computers. That generalization is slightly dated, because many colleges now offer interdisciplinary degree programs in computational biology and bioinformatics, but those programs tend to be small and it is safe to say most practitioners currently in the field started out as one of the two types. Both perspectives are important.

I started with bachelor’s degrees in mathematics and computer science, and then took a break between my Master’s and PhD to work as a programmer for NASA’s Jet Propulsion Laboratory. I felt a call back to mathematics, and after following a traditional academic path from PhD to post-doc to tenure-track university teaching position, I switched gears again and took a programming position in the research division of a statistical software company. It was there I was exposed to machine learning techniques applied to biological problems, setting a course towards my current career in computational biology. There are many such winding paths into the field.

Recently I gave a presentation in Research Forum, which is a weekly opportunity at my company to share current results with the rest of the research community. We make targeted therapies for cancer, and I was hired to establish a dedicated computational biology function. The forum was the first opportunity to try to neatly summarize what we do as computational biologists. It was a challenge. On the one hand, most people in the audience had some exposure to our work, because we collaborate with every group in research. And some functions are well established and well known: we do a lot of genomics, for instance, trying to untangle which genes are regulated under treatment, or which ones might presage resistance to therapy. But the nature of our role is also highly varied. In some ways we are analytical “fixers,” and we are happy to take on any kind of problem related to data analysis. In trying to concisely categorize this type of work for my presentation, the best I could come up with was… “Math.” It’s maybe a bit far from my PhD in complex analysis, but a definite path can be traced back to those roots. And there is a lot of math in this work, albeit in service to a specific (and valuable) purpose.

It’s truly an exciting time to be working in the field of computational biology, especially as it is applied to finding treatments for devastating, tough-to-treat diseases like cancer. Advances in biological understanding and experimental capabilities on the one side, and computational capacity and algorithmic sophistication on the other, have opened the way to new treatments and new tests to get the best therapies to the right patients. Breakthrough advances like immunotherapy have dramatically changed the prognosis for some patients. Advanced non-small-cell lung cancer (NSCLC), for example, has a terrible prognosis, with a 5-year survival rate near 0% for more advanced cases [1]. But so-called checkpoint inhibitors like nivolumab and atezolizumab, which target the cell surface proteins PD-1 and PD-L1 and free up the body’s immune system to attack cancer, have in some cases doubled overall survival rates compared to previous standards of care [2]. This level of improvement is virtually unheard of for new cancer therapies, and it means that the field now cautiously uses the word “cures” in cases it never could before. However, only a subset of patients respond to this type of therapy. So the race is on to 1) find biomarkers, that is, some measurable patient characteristics that can predict who is most likely to respond or not respond; and 2) find other immune checkpoints that are successfully druggable.

Computational biology and bioinformatics have prominent roles to play in both of these endeavors. The search for biomarkers involves sifting through data in which dozens to thousands of variables are collected on patients: from height, weight, age and gender, to the number, length and types of previous treatments, to genomic features like gene expression and mutations measured across potentially hundreds or thousands of genes. All of these patient characteristics are then compared to clinical results to see if any variable, alone or in combination with others, could be related to response. Because it is often the case in these types of problems that there are more variables than patients, modern machine learning techniques, such as regularization and random forests, can be used to overcome the limitations of under-determined systems and identify which variables are most important in predicting response. In my own work I use these techniques as well, to try to understand, for instance, what measurable characteristics of our drugs (which are fairly complicated in their mechanisms of action) contribute most to their potency in an in vitro setting. (This would be a good place to add, as a general recommendation to others as well, that I wish I had taken more statistics!)
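To make the "more variables than patients" point concrete, here is a minimal sketch of one regularization technique, the lasso, fitted by cyclic coordinate descent. Everything in it is an assumption made for illustration: the data are synthetic (30 made-up "patients", 60 made-up variables, of which only the first two truly matter), and the function names and penalty value are hypothetical, not taken from any real analysis.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso update step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_sweeps=200):
    """Minimize (1/2n)*||y - Xw||^2 + penalty*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    residual = list(y)  # residual = y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that leaves variable j out.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * weights[j]) for i in range(n))
            new_w = soft_threshold(rho, penalty * n) / col_sq[j]
            delta = new_w - weights[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                weights[j] = new_w
    return weights


def make_toy_data(n_patients=30, n_vars=60, seed=0):
    """Synthetic data: the response depends only on variables 0 and 1."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_vars)] for _ in range(n_patients)]
    y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0.0, 0.1) for row in X]
    return X, y


X, y = make_toy_data()
weights = lasso_coordinate_descent(X, y, penalty=0.5)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("variables kept by the lasso:", selected)
```

Ordinary least squares has no unique solution here, because there are twice as many unknowns as equations. The penalty term sets most coefficients to exactly zero, which is what lets the method single out a few candidate variables. In practice the penalty would be chosen by cross-validation rather than fixed by hand.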

Finding new immune checkpoints is a special case of the general problem of finding new drug “targets”. This usually means identifying a host molecule like a protein or gene product that is in some way important for the progression of a disease, and whose function can be altered or co-opted with a drug. Computational biology contributes in important ways to this as well. While a traditional approach to finding new targets might be to follow up on a research article that addresses some specific fundamental biology, modern data mining techniques can be applied to vast public data resources like The Cancer Genome Atlas (TCGA) [3] to scan the entire genome, across all cancers, for genes that are, say, preferentially expressed in cancer compared to normal tissue.
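The idea of scanning for genes that are preferentially expressed in cancer can be sketched in a few lines. The example below ranks genes by log2 fold change between tumor and normal samples. The gene names and expression values are invented placeholders, not TCGA data, and the function names, pseudocount, and cutoff are illustrative assumptions. A real analysis would also need statistical testing and a correction for multiple comparisons.

```python
import math
from statistics import mean


def log2_fold_change(tumor_values, normal_values, pseudocount=1.0):
    """Log2 ratio of mean tumor to mean normal expression.

    The pseudocount keeps the ratio finite when a gene is unexpressed.
    """
    return math.log2(
        (mean(tumor_values) + pseudocount) / (mean(normal_values) + pseudocount)
    )


def rank_tumor_enriched(expression, min_log2_fc=1.0):
    """Return (gene, log2 fold change) pairs above the cutoff, largest first."""
    hits = []
    for gene, groups in expression.items():
        lfc = log2_fold_change(groups["tumor"], groups["normal"])
        if lfc >= min_log2_fc:
            hits.append((gene, lfc))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Invented values for three placeholder genes, three samples per group.
toy_expression = {
    "GENE_A": {"tumor": [120.0, 95.0, 140.0], "normal": [10.0, 12.0, 8.0]},
    "GENE_B": {"tumor": [50.0, 55.0, 45.0], "normal": [48.0, 52.0, 51.0]},
    "GENE_C": {"tumor": [30.0, 28.0, 35.0], "normal": [9.0, 11.0, 10.0]},
}

for gene, lfc in rank_tumor_enriched(toy_expression):
    print(f"{gene}: log2 fold change = {lfc:.2f}")
```

With these toy numbers, GENE_A and GENE_C pass the cutoff and GENE_B, which is expressed about equally in both groups, does not.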

Such an exercise ties directly into one of the pleasures of the field: a lot of the data is public, and most of the tools are open source. So a new “experiment” for computational biology practitioners can be as easy as clicking a few links, downloading some data (making sure you have enough local storage space, since the datasets can be huge), and writing some code. Speaking of which, another recommendation to those interested in the field is this: learn R. This open-source statistical package is an industry standard and my daily workhorse. Through its vast contributor network, R seemingly has a package for everything, including an easy-to-use framework for making web apps for visualizing and sharing data.

So, what does mathematics (at least, the math I spent all that time studying for my PhD) have to do with my new career?  While I’m not proving theorems anymore, I would argue that my PhD experience provided important training for my work in a number of ways. A critical, analytical perspective is obviously important for both endeavors. Also, having a PhD background means mathematics is not a barrier to understanding new statistical techniques, and I can focus instead on the ideas. A love of learning, and a humility and curiosity about what you don’t know, are also crossover values. In my job, as in my PhD study, each day means another opportunity to learn, keeping things fresh and interesting. Finally, this is not a job for those who prefer to work alone. Creating new therapies is a complex, collaborative, multi-disciplinary endeavor, requiring clear communication with all the stakeholders. One of the joys of the position is to work with scientists who are not computationally or mathematically oriented and help translate their questions into concrete analytical problems. Teaching experience in academia has really helped in that regard, since it strengthened my skills of listening and explaining.

For those who love math, love programming, and love learning new things, computational biology is a great career option, and provides an opportunity to make a concrete difference in people’s lives.


[1] American Cancer Society, https://www.cancer.org/cancer/non-small-cell-lung-cancer/detection-diagnosis-staging/survival-rates.html

[2] “Further Evidence that Immunotherapy Provides a Longterm Survival Benefit for Lung Cancer Patients,” R&D online, 12 Apr 2018, https://www.rdmag.com/news/2018/04/further-evidence-immunotherapy-provides-longterm-survival-benefit-lung-cancer-patients

[3] The Cancer Genome Atlas, https://cancergenome.nih.gov/

Mathematicians are Needed in Industry

Greg Coxson

At this point in my career, I have worked at a number of organizations, usually technology companies with military contracts. I am convinced that mathematicians strengthen organizations, and sometimes make revolutionary changes, often in small ways that are not celebrated as often as they should be.

My first job was at the Center for Naval Analyses in Alexandria, Virginia. CNA performs long-range studies for the Navy, and is one of the oldest military Operations Research firms. I was working for a crusty old radar engineer, who wanted me to perform Monte Carlo analysis of Russian missile raids. This required thousands of runs of the program. One day, I needed to consult with a mathematician on another floor. I was surprised to find that he knew the simulation I was spending my days with. But what really amazed me was when we started discussing specifics. At that point, he pulled out a big binder containing tables of every possible combination of inputs to the model, and the associated outputs. He had invested the time one week to run all the possibilities and compile them. Having done this, he did not need to run the model for hours a day; instead he just had to pull out the binder and find the right row to read off the results. It impressed me that this approach was much more efficient.
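The binder is a lookup table: run the model once for every input combination, then answer later questions by reading off a row. A minimal sketch follows, assuming the inputs take a small finite set of values and the output for each combination is reproducible. The model function, input names, and values are hypothetical stand-ins and have nothing to do with the actual simulation in the story.

```python
from itertools import product


def expensive_model(size, level, mode):
    """Hypothetical stand-in for a slow model run."""
    mode_factor = {"clear": 1.0, "storm": 0.5}[mode]
    return size * mode_factor / (1 + level)


# Every value each input is allowed to take (all invented for illustration).
INPUT_GRID = {
    "size": [10, 20, 40],
    "level": [0, 1, 3],
    "mode": ["clear", "storm"],
}


def build_lookup_table(model, grid):
    """Run the model once per input combination and store the results."""
    names = list(grid)
    table = {}
    for combo in product(*(grid[name] for name in names)):
        table[combo] = model(**dict(zip(names, combo)))
    return table


TABLE = build_lookup_table(expensive_model, INPUT_GRID)  # the one-time cost

print(len(TABLE))                # 3 * 3 * 2 = 18 rows in the "binder"
print(TABLE[(20, 1, "storm")])   # a lookup instead of a fresh model run
```

The trade is a fixed up-front cost for near-instant answers afterwards, which pays off whenever the same combinations are queried repeatedly.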

Later in my career, when I was working for another company, we had a large number of engineers working on a new ballistic missile system for the Navy. The schedules were aggressive, and the work multi-faceted and difficult. On one of the projects, it appeared necessary, despite the tight schedules, to spend a year running cases of flight trajectories. However, there was a PhD mathematician working on this, and he argued that since all the factors were known, mathematics could be used to perform a quick study, and come up with all the possible trajectories. He saved the company a year of effort and countless computer runs.

In these cases, mathematics is not enough. It is important to get the information into the right hands. A junior engineer or mathematician will not be listened to, at least without concerted effort and the right arguments.

I had the opportunity to learn this first-hand. I was working on a critical program for the Air Force, and one evening before heading home, I was reading the specifications (not always easy reading). Before I went too far with this, I came upon something that stopped me in my tracks. Here, in a system where efficiency was highly emphasized, was an operation being done 80 times in one set, and then in the next set, the inverse of those operations was being done. This seemed to me something that should be fixed. So I went to my boss and pointed this out. However, I was new, and my boss did not know enough mathematics to understand my claim that $e^{a}e^{b} = e^{a+b}$. No matter how I argued, she was not going to take my word for it. Her approach, ultimately, was to arrange a panel discussion with some scary senior analysts around the table to make me retract my story. But I did not back down. Looking back on it, the issue was a badly implemented discrete Fourier transform. I left the company soon after. It took about a month before I started getting phone calls asking for my notes. They had come around to agree with me.
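The specification is not reproduced here, so the following is only an illustration of the identity at stake, not a reconstruction of the actual system. It assumes, as would be typical in a discrete Fourier transform, that each operation is a multiplication by a complex exponential. The count of 80 comes from the story; the angles and the sample value are made up.

```python
import cmath
import math

N_STEPS = 80  # the count from the story; everything else here is invented

# Hypothetical rotation angles, one per operation.
angles = [2 * math.pi * k / N_STEPS for k in range(N_STEPS)]
sample = complex(0.6, -0.8)

# Slow path: apply all 80 rotations, then all 80 inverse rotations.
slow = sample
for theta in angles:
    slow *= cmath.exp(1j * theta)
for theta in angles:
    slow *= cmath.exp(-1j * theta)

# The identity e^a * e^b = e^(a+b): many factors collapse into one.
product_of_factors = 1 + 0j
for theta in angles:
    product_of_factors *= cmath.exp(1j * theta)
single_factor = cmath.exp(1j * sum(angles))

print(abs(slow - sample))                       # tiny: the 160 steps cancel out
print(abs(product_of_factors - single_factor))  # tiny: one factor does the job
```

Up to floating-point rounding, the 160 multiplications leave the sample unchanged, and the 80 forward factors equal a single exponential of the summed angles.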

The point of all this is that mathematicians are needed outside of academia. Mathematics is used, and sometimes misused, every day in almost every industry. Mathematicians are needed for their training, but also their insights. I believe that mathematicians are able to find efficiencies, and new approaches, that others are blind to. Mathematicians are needed to prevent errors, to analyze complex problems and systems. There is no doubt in my mind that we need more mathematicians in industry.

Study Groups with Industry: Mathematics meets the real world

A study group is a type of workshop which brings together mathematicians and people from industry. The meetings typically last for 5 days, Monday-Friday. On the Monday morning the industry representatives present problems of current interest to an audience of applied mathematicians. Subsequently the mathematicians split into working groups to investigate the suggested topics. On the Friday solutions and results are presented to the industry representatives. After the meeting a report is prepared for the company, detailing the progress made and usually with suggestions for further work or experiments. Over the years study groups have proved to be an excellent way of building bridges between universities and companies as well as providing exciting new topics for mathematicians. Of course there is pressure involved in attempting to understand and solve a problem over a short time frame. This can often produce an exciting and intense atmosphere but, in general, a good time is had by all.

 

[Figure: Experiments can often help guide a mathematical investigation (or cause even more confusion)]

The original Study Groups with Industry started in Oxford in 1968. The format proved a popular way for initiating interaction between universities and private industry. The interaction often led to further collaboration, student projects and new fields of research. Consequently, study groups were adopted in other countries, starting in Europe to form the European Study Groups with Industry (ESGI) and then spreading throughout the world. Regular meetings are currently held in Australia, Canada, India, New Zealand, the US, Russia and South Africa. A vast range of topics have been covered in the meetings, including beer and wine bottle labelling, legal sale of rhino horn, spontaneous combustion, mortgaging of cows, building toys, city bike sharing strategies, determining fish freshness, etc. New forms of meeting have also evolved, such as the Mathematics in Medicine or Agri-Food Study Groups.

The popularity of study groups can be attributed to their mutually beneficial effects. Companies benefit from:

  1. The possibility of a quick solution to their problem, or at least guidance on a way forward.
  2. Help from mathematicians in identifying and correctly formulating a problem for further study.
  3. Access to state-of-the-art techniques.
  4. Contacts with top researchers in a given field.

The academics benefit from:

  1. Discovering new problems and research areas with practical applications.
  2. The possibility of further projects and collaboration with industry.
  3. The opportunity for future funding.

An important feature of these meetings is that they can also highlight the talents of students, leading to employment opportunities with the companies. In South Africa, after attending a number of study groups, a group of students took a new direction. Noting the gap in the market for applying mathematics to real-world problems, they started their own company, Isazi Consulting. Now they return to the meetings, this time posing their own problems and looking for new recruits.

Information on the European Study Groups can be found on the website of the European Consortium for Mathematics in Industry. A good source of information for meetings in Europe and the rest of the world is the Mathematics in Industry Information Service, see

ECMI Study Groups https://ecmiindmath.org/study-groups/

MIIS Website http://www.maths-in-industry.org/

 

Tim Myers

Centre de Recerca Matematica

Barcelona, Spain