Friday, February 19, 2010

more dubious cancer genomes

Update 15.6.2010:
A paper studying a lung cancer genome the way it should be done, using fresh samples from a live patient with a known smoking record, appeared in Nature on 27.5., page 473. Of course, as the papers using cell lines got the scoop, this one wasn't hyped or reported in the press, so I managed to miss it and only discovered it when going back to the issue to look up something else. Goes to show that cutting corners pays off nicely in science. Still, lots more studies like this one are needed before one can arrive at definitive conclusions.


The current issue of Nature includes a news piece relating criticism of "cancer genome" sequences based on cell lines, which I have also voiced here. After some ifs and buts, the piece fails to come to a clear conclusion.

Personally, I wouldn't mind people sequencing the old cell lines if they think it helps them understand all the research that has been done on them in the past decades. What annoys me is the claim that this is "the cancer genome" and tells us something important about how cancer works. If the papers appeared in some middling journal for the record, that would be fine. But as articles in Nature, they are automatically overhyped in the press.

I think that to find out something meaningful about real cancers, one would need to proceed a bit more carefully and study proper cancer cells. The argument that in real tumours cancer cells are mixed with healthy cells doesn't stand up, as 1) it is now possible to sequence DNA on a single-molecule basis, and thus from single cells, and 2) if researchers want lots of cancer cells, they just have to sort them out.

And of course, after all that huffing and puffing, the journal goes ahead and publishes another big "article" by the same people, also based on cell lines. I still think this is just cutting corners to be quicker than the people who do the research properly, and to get all the rewards that come with a "first" being published in Nature or Science. (And I am speaking as someone who has also been a co-author of an "article" in Nature in my time, and I'd also be happy to tell you what was wrong with that paper!)

2 comments:

Craig Mermel said...

Michael,

I enjoyed your blog post and understand your concerns regarding the validity/generalizability of genomic studies performed on cancer cell lines. As one of the co-first authors on one of the two copy number studies published in Nature this week, I thought I would share my perspective.

You are correct that the paper from our colleagues at the Sanger Center (doi:10.1038/nature08768) relied on copy number measurements obtained from ~750 cell lines, characterized at a very high resolution using the Affymetrix 6.0 SNP array. By contrast, our collection of over 3,000 cancer DNAs (doi:10.1038/nature08822) was largely derived from primary cancer samples -- over 2,500 in total -- with the remaining ~600 samples obtained from cancer cell lines. In order to collect this many samples, we restricted our analysis to the slightly older and less dense Affymetrix 250K StyI array platform.

I am bringing this up not to tout our own study, but because our mixed dataset enabled us to systematically compare the genomic profiles of primary tumors and cancer cell lines. Like you, we initially shared the concern that including cancer cell lines would bias our results towards genetic regions artifactually gained/lost in cell lines and away from lesions that might not be well represented among cell lines cultured in plastic. However, when we compared the results of our analysis on all the samples with that of just the primary cancers, we found that the output was remarkably similar (> 90% concordance in peak regions identified; this data is displayed in Supplementary Figure 3B of our paper). While there are certainly alterations that are over-represented in cell lines vs. primary tumors (deletion of p16 is a good example), we did not find these regions to be systematically different enough to justify excluding these additional cancer samples from our final analysis. Indeed, we found the cancer cell lines to be very useful in narrowing the boundaries of regions of copy number gain/loss, owing to both their greater purity and greater number of alterations (and hence, a greater number of copy number breakpoints).

It is certainly true that cancer cell lines carry more individual genetic alterations than the typical primary tumor sample. While some of this diversity is likely generated in culture, some of it probably exists in the primary cancer sample, but is obscured due to both contamination of the tumor specimen by non-cancerous stromal cells (a real challenge to this kind of analysis) and, perhaps more importantly, sub-clonal heterogeneity within the cancer. This diversity is impossible to detect in analyses of primary tumor specimens, but is subsequently exposed during the sub-cloning process used to generate a cancer cell line.

While our very global analysis does not answer this question completely, it is our belief that cell lines do a fairly good job of representing the diversity of real driver events in human cancers and are an invaluable tool in cancer research because, unlike primary cells, they can be used for actual experimentation. Certainly, more research needs to be done to clarify the differences that exist between primary tumors and their cell line derivatives, and until such work is done one should always be cautious when interpreting genetic events that have not been validated in primary tumor samples. But the large datasets generated by our group, the Sanger Center, and other collaborators across the world represent the starting point for such work and, we believe, are therefore invaluable resources for the cancer research community in learning more about this dreadful disease.

Sorry for the long-winded response. To close, I do think your post raises valid concerns, and thank you for adding your clearly informed voice to this important conversation.

Warmest regards,

Craig Mermel
M.D./Ph.D. Student, Harvard Medical School
The Broad Institute of MIT and Harvard

Michael said...

Thanks Craig, that's very interesting. I hadn't actually looked closely at your paper before. I had read the first two Stratton papers in full and my main objection to them is that point mutations seen in cancer cell lines are being sold as representative of "the cancer genome". I would in fact be less worried about CNVs, and I like your approach of comparing CNVs from lots of different patients.