Are really low energy structures worth it if the RMSD isn't so good?

Author	Message
Otto Joined: 6 Apr 07 Posts: 27 Credit: 3,567,665 RAC: 0	Message 47976 - Posted: 23 Oct 2007, 22:16:13 UTC What has always puzzled me is whether finding really low energy structures is worthwhile if the RMSD accompanying it isn't so spectacular. And the same could be asked conversely - is a very good RMSD worth it if the energy structure isn't? What kind of relationship is there between RMSD and energy structure in terms of relating to real-world proteins? Are the computed findings only useful for real-life action if BOTH the energy structure AND the RMSD are extremely low? (For example, the RMSD near-zero while the energy structure way, way below zero - of course depending on a given protein.)

Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0	Message 47982 - Posted: 24 Oct 2007, 5:03:24 UTC Last modified: 24 Oct 2007, 5:06:52 UTC
You need to picture it this way: you have no RMSD, because the protein has an unknown structure. In other words, you have no map, because we're in the middle of an uncharted forest... now how do you find your way out? And how do you find a way out without running around in circles and expending needless energy climbing the mountainous terrain?
The energy level is the only predictor you have to go by. It is kind of like being told your GPS coordinates on the planet when you don't have a map to tell you where that really is, or where the best route out of the forest might be. But if you keep following lower energy levels, you will find your way out... which is like saying that if you keep following the water downhill, eventually you will find a direct line out of the forest. Perhaps not the fastest route out, but one of many ways that achieve the objective. And the point where the river exits the forest (even though you haven't seen a river yet) has proven in the past to be a very good route to take, saving much needless climbing of mountains.
You see, looking at RMSD is "cheating". In the real application of Rosetta, you will not know what the RMSD is, because you will study proteins that have an unknown structure. Knowing the RMSD is like having a map of the forest. You only have that if you already know the answer to the protein's structure. And if I already have the structure, then there is no point in trying to discover it.
The reason Rosetta studies these known structures is to test the scientists' approach to solving the problem. It tells them, as the model progresses, how rapidly they are approaching the correct structure, and whether their new approach is producing a better model than their last approach did.
You might want to compare the "really low" energy levels of your models with those of others working on the same protein. Check out the graphs here. Rosetta Moderator: Mod.Sense

adrianxw Joined: 18 Sep 05 Posts: 662 Credit: 12,140,580 RAC: 0	Message 47986 - Posted: 24 Oct 2007, 7:28:50 UTC I find myself again saying to be careful about taking the analogy too far. In a forest, on a planet, a river is likely to run to the coast. In a protein energy landscape, that is absolutely not certain, or even very likely. The protein landscape is full of dips and hollows. I realise that the planet analogy is easy for people to grasp, but I have now seen several threads where people have got so far into the details of the analogy that the details of the REAL problem have been totally lost. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
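
To make the "dips and hollows" point concrete, here is a minimal Python sketch on a made-up one-dimensional landscape. Everything in it (the `landscape` values, the `greedy_descent` helper) is hypothetical and is not Rosetta's actual search; it only shows that "always step downhill" can stop in a shallow hollow instead of the deepest point, depending on where the walk starts.

```python
def greedy_descent(energy, start):
    """Step to the lower neighbour until no neighbour is lower (toy example)."""
    i = start
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(energy)]
        best = min(neighbours, key=lambda j: energy[j])
        if energy[best] >= energy[i]:
            return i  # no lower neighbour: a local minimum
        i = best


# Hypothetical 1-D "landscape": a shallow hollow at index 2,
# the deepest point (global minimum) at index 7.
landscape = [5, 3, 2, 4, 6, 3, 1, -4, 0, 2]

trapped = greedy_descent(landscape, start=0)  # stops in the shallow hollow
found = greedy_descent(landscape, start=5)    # reaches the deepest point
global_min = min(range(len(landscape)), key=lambda j: landscape[j])

print(trapped, found, global_min)
```

In this toy, only the second starting point ends at the global minimum, which is one way to see why a single downhill walk is not enough on a rugged landscape.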

Otto Joined: 6 Apr 07 Posts: 27 Credit: 3,567,665 RAC: 0	Message 47991 - Posted: 24 Oct 2007, 10:14:22 UTC Ok, thanks for the answers.

buren Joined: 18 Nov 07 Posts: 21 Credit: 132,158 RAC: 0	Message 48865 - Posted: 20 Nov 2007, 15:57:36 UTC - in response to Message 47991. Currently the models finish quite fast on modern PCs, about 1-2h each. Are the times only that short for the test runs, or will they be that short with real untested structures as well? In that case it won't take that long to test a whole lot of proteins, making the project not really dependent on DC, so I guess it will take longer with the real ones. But why are the test runs that short?

Luuklag Joined: 13 Sep 07 Posts: 262 Credit: 4,171 RAC: 0	Message 48866 - Posted: 20 Nov 2007, 17:18:13 UTC - in response to Message 48865. "Currently the models finish quite fast on modern PCs, about 1-2h each. Are the times only that short for the test runs, or will they be that short with real untested structures as well? In that case it won't take that long to test a whole lot of proteins, making the project not really dependent on DC, so I guess it will take longer with the real ones. But why are the test runs that short?"
The models aren't finished; you just have the runtime preference set that short. There are people who give 48 hours to a WU.

Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0	Message 48869 - Posted: 20 Nov 2007, 17:37:05 UTC Rosetta tasks will run for as long as the runtime preference established in the Rosetta preferences... give or take the time of one complete model. Some combinations of protein and the method being used take over an hour per model. Others only take 5 or 10 minutes. The models for unknown protein structures take the same amount of time to complete. But it takes finding the best model out of 10,000 or even 100,000 to feel you are getting close to the correct prediction. So it takes all of us to come up with that one silver bullet.
The problem with crunching unknown structures is... well, how do you know if your answer is correct? Or how close your new energy calculations have brought you to where you want to be? So, instead, you crunch structures that are known, but you don't peek at the answer while working on the models. The graphic shows the RMSD, which is a comparison to the known native structure, but that information was not used to guide the course the model will take. Rosetta Moderator: Mod.Sense

buren Joined: 18 Nov 07 Posts: 21 Credit: 132,158 RAC: 0	Message 48953 - Posted: 22 Nov 2007, 17:15:51 UTC Okay, I read about being able to set the time per WU but couldn't find the preferences, so I thought it was a fixed time. What's preferred: more WUs in less detail, or fewer WUs in more detail? I guess the longer the WU runs, the better you know if the model works well, but there is probably a time after which you mostly know whether the model is useful or not. So setting the time too high might waste resources.
Most of my models ended with an RMSD of around 10 after 2h. I don't know exactly how the RMSD is calculated or what RMSD values are acceptable, but at least to the eye the structures still look very different from the real-life structure. So have there been any improvements so far, and how good must a model be, i.e. how close to the real structure, to be considered "it"?
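
On how an RMSD is calculated: a minimal Python sketch of the basic root-mean-square deviation between two sets of matched atom coordinates. It assumes the two structures have already been superimposed and that atoms are paired one-to-one; real comparisons usually add an optimal-superposition step first and are often restricted to a subset of atoms. The coordinates and the `rmsd` helper are made up for illustration and say nothing about how Rosetta's graphics compute the value shown.

```python
import math


def rmsd(model, native):
    """Root-mean-square deviation between paired 3-D points.

    Assumes both lists are in the same order and already superimposed.
    """
    if len(model) != len(native) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, native)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


# Hypothetical three-atom example (same distance unit for both structures).
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 2.0, 0.0)]

print(rmsd(model, native))   # sqrt((0 + 1 + 4) / 3)
print(rmsd(native, native))  # identical structures give 0.0
```

The value is an average distance between corresponding atoms, so lower means the model sits closer to the native structure.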

Keck_Komputers Joined: 17 Sep 05 Posts: 211 Credit: 4,246,150 RAC: 0	Message 48971 - Posted: 23 Nov 2007, 8:28:49 UTC - in response to Message 48953. "What's prefered, more WU less detailed or less WU more detailed? I guess the longer the WU runs the better you know if the model works well, but probably there is a time after which you mostly know if the model is useful or not. So setting the time too high might waste resources."
The project would probably prefer a longer run time to reduce server traffic. The run time setting has no effect on the science. BOINC WIKI BOINCing since 2002/12/8

juniper Joined: 17 Aug 07 Posts: 4 Credit: 4,724 RAC: 0	Message 48977 - Posted: 23 Nov 2007, 20:27:44 UTC - in response to Message 48971. "The run time setting has no effect on the science."
Surely that can't be true? Otherwise why not run dozens of WUs for 5 minutes each, rather than 1 WU every 2 (or more) hours? Or is it the case that running a WU for a very short period of time will result in it being issued again to another cruncher until a certain total amount of time has been spent on a given WU?

Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0	Message 48980 - Posted: 23 Nov 2007, 22:56:06 UTC juniper, you are confusing the terms "models" and "WUs" (work units... now BOINC calls them "tasks"). When you select a runtime preference, you are setting a preferred runtime per task. That runtime is achieved by continuing to crunch models until we arrive as close to the preference as possible. All completed models are then reported back. So, if you run the default 3 hour runtime preference and complete 10 models, or if you run with a preference of 12 hours and complete 40 models... the science done on each model is the same. Obviously the project would rather get back 40 models than 10, but on the other hand, the machine that only did 10 models still has 9 more hours left to do something with. So, you see, running longer doesn't change how each model is done. The science is the same. Rosetta Moderator: Mod.Sense
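
The relationship between runtime preference and model count can be sketched in a few lines of Python. This is only an illustration: the stopping rule (start another model only if that lands at least as close to the preferred runtime as stopping now), the `models_in_task` name, and the fixed 18 minutes per model (implied by the 10-models-in-3-hours example) are all assumptions, not the actual Rosetta logic.

```python
def models_in_task(preferred_minutes, minutes_per_model):
    """Count whole models crunched in one task (hypothetical stopping rule)."""
    if minutes_per_model <= 0:
        raise ValueError("minutes_per_model must be positive")
    completed = 0
    elapsed = 0
    # Always finish at least one model; after that, start another only if the
    # task would end no further from the preferred runtime than it is now.
    while completed == 0 or (
        abs(elapsed + minutes_per_model - preferred_minutes)
        <= abs(elapsed - preferred_minutes)
    ):
        elapsed += minutes_per_model
        completed += 1
    return completed, elapsed


# Each model takes the same 18 minutes whichever preference is chosen;
# only the number of models reported per task changes.
print(models_in_task(3 * 60, 18))   # 3-hour preference
print(models_in_task(12 * 60, 18))  # 12-hour preference
```

Under these assumptions a longer preference just means more models of the same kind per task, and a task can overshoot or undershoot the preference by up to about one model's runtime.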

buren Joined: 18 Nov 07 Posts: 21 Credit: 132,158 RAC: 0	Message 49039 - Posted: 25 Nov 2007, 13:07:25 UTC Last modified: 25 Nov 2007, 13:08:05 UTC Okay, with DC it probably really doesn't matter how many models a single computer runs for each structure/WU. I forgot that other computers can pick up the same WU with different models, so the total number of tested models per WU stays the same: if one computer only tests 5 models per WU, another computer might check the remaining 45. Does anyone know how many models per structure are actually tested? Or does it depend on the structure?

dcdc Joined: 3 Nov 05 Posts: 1834 Credit: 124,260,318 RAC: 2	Message 49043 - Posted: 25 Nov 2007, 14:16:08 UTC It ranges from thousands to millions - looks like the record so far is 28 million:
CNTRL_01ABRELAX_SAVE_ALL_OUT_-1ubi_-_filters 28,755,667
CNTRL_01ABRELAX_SAVE_ALL_OUT_-1di2_-_filters 6,936,746
CNTRL_01ABRELAX_SAVE_ALL_OUT_-1cc8A-_filters 5,193,077
CNTRL_01ABRELAX_SAVE_ALL_OUT_-1bq9A-_filters 4,338,378

rochester new york Joined: 2 Jul 06 Posts: 2842 Credit: 2,020,043 RAC: 0	Message 71774 - Posted: 9 Dec 2011, 23:22:32 UTC http://www.miketyka.com/2010/12/energy_landscapes/