Article Date: 09 Dec 2011 - 1:00 PST
Fifty years after the pioneering discovery that a protein's three-dimensional structure is determined solely by the sequence of its amino acids, an international team of researchers has taken a major step toward fulfilling the tantalizing promise: predicting the structure of a protein from its DNA alone.
The team at Harvard Medical School (HMS), Politecnico di Torino / Human Genetics Foundation Torino (HuGeF) and Memorial Sloan-Kettering Cancer Center in New York (MSKCC) has reported substantial progress toward solving a classical problem of molecular biology: the computational protein folding problem.
The results were published in the journal PLoS ONE.
In molecular biology and biomedical engineering, knowing the shape of protein molecules is key to understanding how they perform the work of life, the mechanisms of disease and drug design. Normally the shape of protein molecules is determined by expensive and complicated experiments, and for most proteins these experiments have not yet been done. Computing the shape from genetic information alone is possible in principle. But despite limited success for some smaller proteins, this challenge has remained essentially unsolved. The difficulty lies in the enormous complexity of the search space, an astronomically large number of possible shapes. Without any shortcuts, it would take a supercomputer many years to explore all possible shapes of even a small protein.
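To give a sense of scale, the back-of-the-envelope estimate below counts candidate shapes under purely illustrative assumptions that are not taken from the study: each amino acid residue adopts one of three conformations, and a machine evaluates 10^15 shapes per second. The function name and its default values are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_search_years(n_residues, states_per_residue=3, shapes_per_second=1e15):
    """Years needed to enumerate every shape, one by one.

    Both defaults are illustrative assumptions, not measured values.
    """
    # Each residue independently picks one of `states_per_residue` conformations.
    total_shapes = states_per_residue ** n_residues
    return total_shapes / shapes_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    for n in (50, 100, 160):
        print(f"{n} residues: about {exhaustive_search_years(n):.1e} years")
```

Under these assumptions, a 100-residue protein works out to roughly 10^25 years of brute-force search, which is why any practical method needs a shortcut.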
"Experimental structure determination has a hard time keeping up with the explosion in genetic sequence information," said Debora Marks, a mathematical biologist in the Department of Systems Biology at HMS. Marks worked closely with Lucy Colwell, a mathematician who recently moved from Harvard to Cambridge University. They collaborated with physicists Riccardo Zecchina and Andrea Pagnani in Torino in a team effort initiated by Marks and computational biologist Chris Sander of the Computational Biology Program at MSKCC, who had earlier attempted a similar solution to the problem when substantially fewer sequences were available.
"Collaboration was key," Sander said. "As with many important discoveries in science, no one could provide the answer in isolation."
The international team tested a bold premise: that evolution can provide a roadmap to how the protein folds. Their approach combined three elements: evolutionary information accumulated over many millions of years; data from high-throughput genetic sequencing; and a key method from statistical physics, co-developed in the Torino group with Martin Weigt, who recently moved to the University of Paris.
Using the accumulated evolutionary information in the form of the sequences of thousands of proteins, grouped in protein families that are likely to have similar shapes, the team found a way to solve the problem: an algorithm to infer which parts of a protein interact to determine its shape. They used a principle from statistical physics called "maximum entropy" in a method that extracts information about microscopic interactions from measurement of system properties.
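The article does not spell out the algorithm. As a rough illustration of how co-variation between positions in a protein family can hint at which residues interact, the sketch below scores pairs of alignment columns by mutual information. This is a simpler, pairwise stand-in and not the team's maximum-entropy method, which fits a global model of the whole sequence. The toy alignment and the function names are invented for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment."""
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * math.log2(p_ab * n * n / (count_i[a] * count_j[b]))
    return mi


def rank_column_pairs(alignment):
    """Return (score, i, j) tuples for all column pairs, highest score first."""
    length = len(alignment[0])
    scores = [
        (column_mutual_information(alignment, i, j), i, j)
        for i, j in combinations(range(length), 2)
    ]
    return sorted(scores, reverse=True)


# Hypothetical toy family: columns 0 and 3 change together (A with E, D with K),
# column 1 never changes, and column 2 varies independently of the others.
TOY_ALIGNMENT = ["AGLE", "AGVE", "DGLK", "DGVK"]

if __name__ == "__main__":
    for score, i, j in rank_column_pairs(TOY_ALIGNMENT):
        print(f"columns {i} and {j}: {score:.2f} bits")
```

In the toy data only columns 0 and 3 score above zero, marking them as the candidate interacting pair. Real families involve thousands of sequences, and pairwise scores like this one can be misled by indirect correlations, which is the kind of problem a global statistical model is meant to address.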
"The protein folding problem has been a huge combinatorial challenge for decades," said Zecchina, "but our statistical methods turned out to be surprisingly effective in extracting essential information from the evolutionary record."
With these internal protein interactions in hand, widely used molecular simulation software developed by Axel Brunger at Stanford University generated the atomic details of the protein shape. For the first time, the team was able to compute shapes with unprecedented accuracy from sequence information alone for a test set of 15 diverse proteins, with no protein size limit in sight.
"Alone, none of the individual pieces are completely novel, but apparently nobody had put all of them together to predict 3D protein structure," Colwell said.
To test their method, the researchers initially focused on the Ras family of signaling proteins, which has been extensively studied because of its known link to cancer. The structures of several Ras-type proteins have already been solved experimentally, but the proteins in the family, with about 160 amino acid residues, are larger than any proteins previously modeled computationally from sequence alone.
"When we saw the first computationally folded Ras protein, we nearly went through the roof," Marks said. To the researchers' amazement, their model folded within about 3.5 angstroms of the known structure with all the structural elements in the right place. And there is no reason, the authors say, that the method couldn't work with even larger proteins.
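The article does not say how the 3.5-angstrom figure was measured. A common measure of agreement between a predicted and a known structure is the root-mean-square deviation (RMSD) over matched atoms. The sketch below is a minimal version, assuming the two coordinate sets are already superimposed and list the same atoms in the same order; the coordinates are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the structures are already superimposed; no alignment is performed.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Made-up coordinates in angstroms: every atom is displaced by 3 along z.
KNOWN = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
MODEL = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]

if __name__ == "__main__":
    print(f"RMSD: {rmsd(KNOWN, MODEL):.1f} angstroms")
```

In practice the two structures are first rotated and translated onto each other to minimize this value, a step omitted here.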
The researchers caution that there are other limits, however: Experimental structures, when available, generally are more accurate in atomic detail. And, the method works only when researchers have genetic data for large protein families. But advances in DNA sequencing have yielded a torrent of such data that is forecast to continue growing exponentially in the foreseeable future.
The next step, the researchers say, is to predict the structures of unsolved proteins currently being investigated by structural biologists, before exploring the large uncharted territory of currently unknown protein structures.
"Synergy between computational prediction and experimental determination of structures is likely to yield increasingly valuable insight into the large universe of protein shapes that crucially determine their function and evolutionary dynamics," Sander said.
Article adapted by Medical News Today from original press release. This research was funded by the National Cancer Institute and the Engineering and Physical Sciences Research Council of the United Kingdom.
Written by R. Alan Leo.
Citation: PLoS ONE, December 6, 2011 "Protein 3D structure computed from evolutionary sequence variation," Marks et al.
Harvard Medical School