A thought about method

Message boards : Rosetta@home Science : A thought about method

Profile SOAN
Joined: 27 Sep 05
Posts: 252
Credit: 63,160
RAC: 0
Message 29845 - Posted: 23 Oct 2006, 4:04:06 UTC

Would it make sense to take the lowest energy predictions (say the lowest twenty) for a protein after the initial phase of prediction and run each of them as though it were a homolog WU?

Maybe this approach is already being used.
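Editor's sketch of the selection step proposed above, assuming each prediction is summarised as an (identifier, energy) pair and that lower energy is better. The identifiers and energies are made up for illustration, and how the chosen structures would then be resubmitted as work units is not shown.

```python
import heapq

# Hypothetical (decoy_id, energy) pairs from a first prediction round.
decoys = [
    ("d1", -120.5),
    ("d2", -98.2),
    ("d3", -131.0),
    ("d4", -110.7),
    ("d5", -125.3),
]

def lowest_energy(decoys, n):
    """Return the n decoys with the lowest energy, best first."""
    return heapq.nsmallest(n, decoys, key=lambda d: d[1])

# The post suggests twenty; three is used here because the toy list is short.
seeds = lowest_energy(decoys, 3)
print(seeds)
```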
Profile Charlie Abrams

Joined: 25 Jul 06
Posts: 13
Credit: 6,870,894
RAC: 0
Message 30000 - Posted: 25 Oct 2006, 20:05:13 UTC

I'd be interested in the RMSD between my computer's prediction and the overall lowest energy prediction - in other words, how close are all of the predictions to each other? As I understand it, two structures may be both exactly 1.0 RMSD from the correct structure but could be much more (or less) than 1.0 RMSD from each other.
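Editor's sketch of the point above, using a toy four-atom "structure". The coordinates are invented, and the `rmsd` helper assumes the structures are already in the same frame; real comparisons would normally superimpose the structures first, which this sketch does not do.

```python
import math

def rmsd(a, b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    No superposition is performed: the inputs are assumed to be pre-aligned.
    """
    if len(a) != len(b):
        raise ValueError("structures must have the same number of atoms")
    total = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(total / len(a))

def displace(points, index, dy):
    """Return a copy of points with one atom moved by dy along the y axis."""
    moved = list(points)
    x, y, z = moved[index]
    moved[index] = (x, y + dy, z)
    return moved

# Hypothetical reference structure.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]

model_a = displace(native, 0, 2.0)    # first atom pushed up
model_b = displace(native, 0, -2.0)   # first atom pushed down
model_c = displace(native, 0, 2.0)    # same error as model_a

print(rmsd(native, model_a))   # 1.0
print(rmsd(native, model_b))   # 1.0
print(rmsd(model_a, model_b))  # 2.0
print(rmsd(model_a, model_c))  # 0.0
```

In this toy case, two models that are each 1.0 from the reference end up anywhere from 0.0 to 2.0 apart, depending on whether their errors point the same way.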
Christoph

Joined: 10 Dec 05
Posts: 57
Credit: 1,512,386
RAC: 0
Message 30055 - Posted: 26 Oct 2006, 14:35:56 UTC

> I'd be interested in the RMSD between my computer's prediction and the overall lowest energy prediction - in other words, how close are all of the predictions to each other?

The developers made a very cool feature, the 'results'

> As I understand it, two structures may be both exactly 1.0 RMSD from the correct structure but could be much more (or less) than 1.0 RMSD from each other.

Right.
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 30058 - Posted: 26 Oct 2006, 14:44:54 UTC - in response to Message 29845.  

> Would it make sense to take the lowest energy predictions (say the lowest twenty) for a protein after the initial phase of prediction and run each of them as though it were a homolog WU?
>
> Maybe this approach is already being used.

Yes, it does make sense, and this is one of the strategies we've been exploring.
Pendragon

Joined: 22 Jul 06
Posts: 2
Credit: 36,499
RAC: 0
Message 30943 - Posted: 11 Nov 2006, 11:00:01 UTC - in response to Message 30058.  

Has the use of a genetic algorithm (in the computer science sense) been considered? For those not familiar with this, it tries to evolve better solutions to a problem through "natural selection".

You start with a population pool of randomly generated solutions.

Members of the pool are randomly selected and bred to produce an offspring solution which inherits its characteristics at random from its parents. A small random mutation might also be added at this point.

All the offspring are evaluated for "fitness" against some predefined criterion.

The total population pool is then culled, dropping those of lowest fitness.

Repeat breeding etc. The fitness of the pool should increase with time. When things seem to have stabilised take your fittest individual.

The above process can be repeated from scratch with a new random pool to see how good your previous solution was.

Rosetta could work well with this. The Rosetta main computers could keep track of the pool, and do the breeding. Offspring could then be sent out as starting positions for the normal process (which would do some random jiggling) and come back with an energy to be used for the fitness evaluation.

The breeding could be done by selecting a random point along the chain and using one parent's angles up to that point and the other's thereafter. Some rearrangement might be needed if this resulted in the molecule going through itself.

Tim
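Editor's sketch of the loop described in this post, assuming each candidate is a list of backbone angles and that "fitness" is a lower-is-better energy. The `toy_energy` function is a hypothetical stand-in, not Rosetta's energy function, and the check for a chain passing through itself is left out entirely.

```python
import math
import random

def toy_energy(angles):
    """Hypothetical energy: lower is better, with a minimum of 0 at all-zero angles."""
    return sum(1.0 - math.cos(a) for a in angles)

def crossover(p1, p2, rng):
    """Single-point crossover: one parent's angles up to a random cut, the other's after."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def mutate(angles, rng, rate=0.1, scale=0.3):
    """Add a small Gaussian perturbation to each angle with probability `rate`."""
    return [a + rng.gauss(0.0, scale) if rng.random() < rate else a for a in angles]

def evolve(n_angles=10, pool_size=30, generations=50, seed=0):
    rng = random.Random(seed)
    # Start with a pool of randomly generated solutions.
    pool = [[rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
            for _ in range(pool_size)]
    initial_best = min(toy_energy(ind) for ind in pool)
    for _ in range(generations):
        # Breed: pick two distinct parents at random for each offspring.
        children = []
        for _ in range(pool_size):
            p1, p2 = rng.sample(pool, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        # Cull: keep only the pool_size lowest-energy members of parents plus offspring.
        pool = sorted(pool + children, key=toy_energy)[:pool_size]
    return initial_best, toy_energy(pool[0])

initial_best, final_best = evolve()
print(f"best energy before: {initial_best:.3f}, after: {final_best:.3f}")
```

Because parents compete with their offspring in the cull, the best energy in the pool can never get worse from one generation to the next. In the scheme proposed above, the evaluation step would be done by volunteer machines rather than by a local function call.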
James Thompson

Joined: 13 Oct 05
Posts: 46
Credit: 186,109
RAC: 0
Message 30963 - Posted: 11 Nov 2006, 20:15:31 UTC - in response to Message 30943.  

There are people in the group who have tried evolutionary methods similar to what you describe. As I understand it, one large problem comes when trying to implement crossover of two structures: sometimes an attempt at crossing over will result in atoms attempting to occupy the same space. As we get closer and closer to the real solution, each organism should be more tightly packed, and these clashes will likely be more frequent. Without recombination between organisms, genetic algorithms are probably not much better than random searches. I do know that people within our group have successfully used recombination of structures in the past, but currently recombination-style approaches are not part of our standard structure prediction protocol.

Here's another way to think about it: genetic algorithms are an approach to function optimization that works very well generally, as there are no explicit mathematical requirements for the function to be optimized. Rosetta's energy function has been designed to have some useful properties that can be exploited by more specific function optimization methods. One example of this is that the energy function is first-order differentiable, so that we can use gradient descent to find local minima in the neighborhood of the current trajectory.

Feel free to ask more questions if any of that doesn't make sense!
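Editor's sketch of gradient descent on a differentiable function, as mentioned in the post above. The one-dimensional `energy` function, its `gradient`, the step size, and the starting point are all assumptions chosen for illustration and have nothing to do with Rosetta's actual energy function.

```python
def energy(x):
    """Hypothetical 1-D energy landscape with two minima."""
    return x**4 - 3 * x**2 + x

def gradient(x):
    """Analytic first derivative of energy(x)."""
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x, step=0.01, tol=1e-8, max_iters=10_000):
    """Walk downhill from x until the slope is nearly flat or the iteration cap is hit."""
    for _ in range(max_iters):
        g = gradient(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x

start = 2.0
x_min = gradient_descent(start)
print(f"start {start} -> minimum near x = {x_min:.4f}, energy {energy(x_min):.4f}")
```

Starting from 2.0, the descent settles in the nearby minimum on the positive side. A deeper minimum exists on the negative side of this toy landscape, but descent alone never crosses the barrier to reach it.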

Pendragon

Joined: 22 Jul 06
Posts: 2
Credit: 36,499
RAC: 0
Message 30986 - Posted: 12 Nov 2006, 9:48:17 UTC - in response to Message 30963.  

> There are people in the group who have tried evolutionary methods similar to what you describe. [snip] but currently recombination-style approaches are not part of our standard structure prediction protocol.
>
> Here's another way to think about it: genetic algorithms are an approach to function optimization that works very well generally, as there are no explicit mathematical requirements for the function to be optimized. Rosetta's energy function has been designed to have some useful properties that can be exploited by more specific function optimization methods. One example of this is that the energy function is first-order differentiable, so that we can use gradient descent to find local minima in the neighborhood of the current trajectory.


Thanks for the prompt reply. I wasn't aware of the properties of the energy function. Exploiting them to find a local minimum is sensible. Part of the challenge is that there are many local minima. Finding the global one is the tricky bit. I was wondering if a genetic algorithm might help by allowing "good ideas" for folding certain bits to be combined through recombination. (On the other hand, what works in isolation might not work so well when other bits of the molecule are nearby.)
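Editor's sketch of the local-versus-global problem raised here, assuming a made-up one-dimensional energy landscape. It restarts a plain descent from several evenly spaced starting points and keeps the lowest result; this multi-start approach is used only as an illustration and is not claimed to be what the project does.

```python
def energy(x):
    """Hypothetical 1-D landscape with one shallow and one deep minimum."""
    return x**4 - 3 * x**2 + x

def gradient(x):
    return 4 * x**3 - 6 * x + 1

def descend(x, step=0.01, iters=2000):
    """Fixed-step gradient descent from a single starting point."""
    for _ in range(iters):
        x -= step * gradient(x)
    return x

# Evenly spaced starting points between -2.0 and 2.0.
starts = [-2.0 + 0.4 * i for i in range(11)]
minima = [descend(s) for s in starts]

# Different starts fall into different basins; the lowest one found is kept.
distinct = sorted({round(m, 3) for m in minima})
best = min(minima, key=energy)
print("distinct minima found:", distinct)
print(f"lowest: x = {best:.4f}, energy = {energy(best):.4f}")
```

Each descent only finds the bottom of whichever basin it started in, so the result depends on the starting point. With two basins this is easy to cover; the difficulty described above is that a folding landscape has vastly more of them.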

Incidentally, is the molecule modelled in isolation, or are water molecules assumed to be around the place, potentially forming H-bonds?

Tim
Profile Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 30998 - Posted: 12 Nov 2006, 16:09:01 UTC

The project team has talked about this in passing a few times, so I'll take a stab at providing an answer.

Water is presumed (in the formulas and scoring algorithms used) to basically surround the protein, and the R@H searches try to fold in such a way that the hydrophobic portions (those repelled by water) are folded into the core of the protein (thus shielding them from the water), and that the hydrophilic portions (those attracted to water) are on the outside (thus exposed to the water).

I don't believe they model exactly WHERE the water molecules will be, or how they will interact with the atoms that comprise the protein (and I've always wondered if this might be a design point that could improve precision, if water were more explicitly modeled).
Profile River~~
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 31032 - Posted: 13 Nov 2006, 6:18:39 UTC - in response to Message 30998.  
Last modified: 13 Nov 2006, 6:25:59 UTC

> I don't believe they model exactly WHERE the water molecules will be, or how they will interact with the atoms that comprise the protein (and I've always wondered if this might be a design point that could improve precision, if water were more explicitly modeled).


I'd like to comment on this, and I'd remind folk I am coming from a physics background not biochem.

In the case of a single protein molecule the water will follow the molecule, rather than the other way round, so there would be no advantage in trying to model the exact placement of the water.

It is timely that this issue has been raised again now, as we are just about to be sent work that models a small protein whose molecules clump together. (see David's posting on this)

I'd imagine that the presence or absence of single water molecules, and their exact positions, might be very significant, like the way water molecules get built into some inorganic crystals (think CuSO4·5H2O, for example). That would add a huge number of degrees of freedom to the modelling, and I am curious as to whether it will prove necessary.

River~~
