Treating cancer is incredibly complex, and there is no one-size-fits-all solution. But there is something that can help physicians create treatments customized for individuals: big data.
There's an abundance of information being gathered in the health care sector, including genome sequencing, tissue imaging, electronic health records and personal health trackers.
As director of cancer data science for the UCLA Jonsson Comprehensive Cancer Center, Paul Boutros and the other researchers in his laboratory are using big data to help optimize treatment for people, further advancing UCLA's reputation as a world leader in genomics-based cancer research.
In this interview, the internationally acclaimed cancer data scientist, who recently joined the David Geffen School of Medicine at UCLA as a professor of urology and human genetics, explains the impact of big data and how it will help improve scientists' understanding of cancer.
First, what exactly is big data?
Big data is really just a fancy-sounding term for having a lot of information. And it could be information of almost any type, like images or genetic data. But the goal is to figure out how to use that data to allow patients, caregivers and clinicians to make better treatment decisions.
How do you use data to make predictions about what treatment would be best for an individual person?
We take multiple types of cancer data, often including DNA sequencing combined with clinical records, and try to analyze that to make clinically useful predictions. We try to inform the way patients are going to be treated, figuring out how to take all of our existing therapies, and all of our new therapies, to identify the best unique combination for each person. We try to find out what strategy will lead to the most benefit with the fewest side effects for each individual. So in a way, our focus is on the personalized optimization of treatment, rather than the development of new treatments.
Sometimes in a tumor type like prostate cancer, where I do a lot of my research, the question is actually whether treatment is even beneficial. Maybe extra therapy, with its costs and potential side effects, will not increase length or quality of life.
What does a data scientist do?
A data scientist is really somebody who does three things. First, we allow the data to teach us. We don't go in with a preconception of what's going to be there. We then develop new statistical tools and algorithms to try to analyze the data, and then ultimately, we work with teams of biologists and clinicians to try to make sure that those discoveries are real. One of the amazing things about data science is that it's incredibly efficient. We're able to take data sets that have been analyzed once or twice and discover new things by coming at them from different angles. A beautiful example of this is a study looking at how cancers differ between men and women. We used about 8,000 tumors sequenced by different projects, and identified specific DNA mutations that are more frequent in male or female cancers. This begins to give us an understanding of why, clinically, there are sex differences in response to specific therapies. And of course studies like this mean that each of the precious patient samples that we look at gets analyzed to its full complexity and we understand everything that we can about the disease.
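To make the kind of comparison described above concrete, here is a minimal sketch of how one might check whether a single mutation is more frequent in tumors from one sex than the other. The interview does not say which statistical method the lab used; Fisher's exact test is an assumption chosen for illustration, and the function names and all counts below are hypothetical, not results from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x mutated tumors in the first row
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Add up every possible table that is no more likely than the observed one
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


def compare_mutation_by_sex(mut_male, n_male, mut_female, n_female):
    """Return (male frequency, female frequency, p-value) for one mutation."""
    p_value = fisher_exact_two_sided(
        mut_male, n_male - mut_male, mut_female, n_female - mut_female
    )
    return mut_male / n_male, mut_female / n_female, p_value


# Hypothetical counts for illustration only (not data from the study):
# a mutation seen in 120 of 4,000 male tumors and 60 of 4,000 female tumors.
freq_m, freq_f, p = compare_mutation_by_sex(120, 4000, 60, 4000)
print(f"male: {freq_m:.1%}, female: {freq_f:.1%}, p-value: {p:.2g}")
```

A real analysis would repeat a test like this across many mutations and tumor types, so it would also need to correct for multiple comparisons and account for factors such as age and cancer type.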
What are some of the exciting research topics you are working on now?
One of the major projects we're looking at right now is how a cancer differs depending on whether it arises in a man or in a woman. We've been mining thousands of cancer genomes analyzed by groups around the world, including here at UCLA, and using that to discover the subtle differences in how cancer grows depending on who it grows in. And that's giving us opportunities to identify new drugs and new ways of optimizing treatment for individuals.
We're studying how factors like ethnicity, gender, sex and lifestyle influence the way tumors develop, and that gives us a real opportunity to understand not just the cancer as it is when it's diagnosed, but how it grew in the context of an individual with a long lifespan and a lot of different factors that affect each of us before that happens.
Part of your research focuses on "crowdsourcing." What does that mean?
One of the most amazing opportunities of modern data science is crowdsourcing. We have a whole series of projects where we take big data sets and make them available on the internet so groups around the world can try to work with them. Not just scientists, but people at home can try to look at the data and see what they can find, and this can lead to all sorts of new discoveries.
What do you love most about working with data?
Every day, I come in to work and have the privilege of trying to explain new things that have never been understood about cancer before. And I get to work with incredibly talented and dedicated people who feel the same way. There are chemists and computer scientists, there are engineers and physicists, and that leads to a real diversity in the way that we think and a diversity that lets us find things that we could never find from a single viewpoint.
And because it involves dealing with all types of data, data science kind of lies at the heart of medicine and biology. UCLA is one of the best places in the world to do data science. We have brilliant trainees, fantastic cancer research, an incredible community of mathematicians, statisticians and computer scientists, and the willingness from everybody here on campus to collaborate, and that makes all the difference.
Scientists tend to go through a lot more failures than successes. What motivates you to keep going?
I'm primarily motivated by two things: first, the incredible resourcefulness and passion of the trainees who are driving the research on a day-to-day basis, and who are just incredibly smart and talented; second, the urgent clinical need and the desire to figure out how to help patients have more success in their treatment. These two things get me through all of the challenges of research.
And in many cases, once you see those two things, you start to realize failure really isn't a failure. It's teaching me something about how I need to do a better job of mentoring students and postdocs, or it's teaching me something about why a certain type of approach is not going to be clinically useful. And you can always find that silver lining as long as you've got those bedrock principles of focusing on the patient and focusing on mentoring the trainees who are our next generation of scientists and clinicians.