This is like using a sledgehammer to put up pictures. You simply do not need the kind of parallel compute grunt something like Watson can provide to do correlation analysis as they describe. You could cobble this together in {dirty secret scripting language of choice} and run it on your laptop.
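To give a sense of the scale mismatch, here is a minimal sketch in Python (the file name and column names are made up for illustration) - this kind of correlation analysis runs comfortably on a laptop:

    # Hypothetical example: pairwise correlation of patient metrics from a CSV.
    # The file name and column names are invented for illustration.
    import pandas as pd

    df = pd.read_csv("patient_metrics.csv")
    cols = ["age", "bmi", "blood_pressure", "days_readmitted"]
    print(df[cols].corr(method="pearson"))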

This is a PR piece - not the article itself, but the activity it describes - meant to push the idea of computers making decisions about healthcare based on metrics alone, rather than humans making them based on metrics and compassion.

The cynic in me sees this as setting a disquieting precedent: healthcare distributed not only according to patients' prospects and treatment needs, but also according to other factors a financially liable party, say an insurer, would be interested in - such as the earning potential of the patient.

I'm pretty sure there was a Star Trek episode about this.

I've worked in the area of clinical genomics using whole genome sequencing. Your statement is unfortunately untrue, though it perhaps could be true if the tools were better written.

While it is easy enough to analyze a single genome on your laptop, most of the current popular analytical tools simply fall over when you start looking at hundreds of genomes, even on a large server. Even basic steps like combining multiple genomes into one file with consistent naming of variants can take an entirely ridiculous multi-terabyte amount of RAM, because the tools to do so just weren't written with this scale in mind.

Most of these tools could (and should) be rewritten to work without loading the whole data set into memory and to run natively on a cluster of commodity machines. There is some resistance to this, of course, because scientists prefer to use published and established methods, and often feel that new methods need to be published and peer reviewed first.
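As a rough sketch of what the out-of-core approach looks like (entirely hypothetical - the file layout and field order are made up, and it assumes per-sample variant files already sorted by chromosome and position):

    # Hypothetical sketch: merge per-sample variant files that are already
    # sorted by (chromosome, position), without loading everything into RAM.
    # The file format and field order are invented for illustration.
    import heapq

    def records(path, sample):
        # Stream (chrom, pos, ref, alt, sample) tuples from one sorted file.
        with open(path) as fh:
            for line in fh:
                chrom, pos, ref, alt = line.rstrip("\n").split("\t")[:4]
                yield (chrom, int(pos), ref, alt, sample)

    def merge_variants(sample_paths, out_path):
        # sample_paths: {sample_name: path_to_sorted_variant_file}
        streams = [records(path, sample) for sample, path in sample_paths.items()]
        with open(out_path, "w") as out:
            # heapq.merge holds only one record per input stream in memory.
            for chrom, pos, ref, alt, sample in heapq.merge(*streams):
                out.write(f"{chrom}\t{pos}\t{ref}\t{alt}\t{sample}\n")

The point is that only one record per input stream is in memory at any time, so memory use stays flat no matter how many genomes you merge.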

Until new tools are written and widely adopted, a large shared-memory machine is a bandaid that many hospitals and research groups seem eager to adopt.

Yes indeed. And new tools are being written - see the Adam project for an interesting example: https://github.com/bigdatagenomics/adam and the associated variant caller Avocado: https://github.com/bigdatagenomics/avocado. Others are also trying to get the old tools working on Hadoop, for instance Halvade: https://github.com/ddcap/halvade/wiki/Halvade-Manual, Hadoop-BAM: https://github.com/HadoopGenomics/Hadoop-BAM, SeqPig: http://seqpig.sourceforge.net/, and the guys at BioBankCloud: https://github.com/biobankcloud. It's going to take quite a while for this stuff to get fleshed out, and for researchers to adopt it. But the sheer weight of data is going to force things in the Hadoop direction eventually. It is inevitable.
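For anyone curious what the Hadoop/Spark-flavoured workflow feels like, here's a toy sketch in plain PySpark - this is not the actual API of Adam or Halvade, just the general shape of processing VCF text on a cluster, with an illustrative HDFS path:

    # Hypothetical sketch: count variants per chromosome across a cohort of
    # VCF files using plain PySpark. Paths are illustrative only.
    from pyspark import SparkContext

    sc = SparkContext(appName="variant-counts")
    lines = sc.textFile("hdfs:///cohort/*.vcf")           # illustrative path
    counts = (lines
              .filter(lambda l: not l.startswith("#"))    # skip VCF header lines
              .map(lambda l: (l.split("\t")[0], 1))       # key each variant by chromosome
              .reduceByKey(lambda a, b: a + b))
    for chrom, n in counts.collect():
        print(chrom, n)
    sc.stop()

The same script runs on a laptop or on a thousand-node cluster, which is exactly the property the current generation of genomics tools lacks.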