Bottleneck and Results

Author	Message
Dimitris Hatzopoulos Send message Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0	Message 9205 - Posted: 17 Jan 2006, 14:36:39 UTC Last modified: 17 Jan 2006, 15:12:18 UTC I'd just like to ask here again, the same question that was asked at F@H forum (original question): "I'm really pleased that thousands of people are donating computer time to the advancement of biological research. I had a couple questions about this project. What is the bottleneck (weakest link) with the science being done here? If more processing power is made available, will useful results be published more quickly? Or is the 'slowest' part of the research the researchers themselves? Ideally, we would want the limiting factor to be the researchers, so they're not waiting for results of our work. Basically, will more processing speed the science? And what's the upper limit on processing numbers that above which will not be useful (ie, at what level of processing power will the scientists be the limiting factor?). My second question is with regards to useful results. I measure useful results with citations - not the best measure, but certainly a measure. Is there a list of papers that have cited the work done by Rosetta and related distributed computing efforts? If volunteers would be willing to cite papers that cite these efforts, that would be great. Basically, I'm wondering if there's value to progressing research by getting more people on side." Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity ID: 9205 · Rating: 0 · rate: / Reply Quote

River~~ Send message Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0	Message 9207 - Posted: 17 Jan 2006, 15:10:03 UTC These are intelligent points. I'd like to pose the question another way: with current and projected staff levels, how many credits/day do you need us to do to make you productive enough for the project to be worthwhile; and how many credits/day would it take for us to get to the point where (at currently projected staff levels) you could not do any better with more crunch power? River~~ ID: 9207 · Rating: 0 · rate: / Reply Quote

River~~ Send message Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0	Message 9208 - Posted: 17 Jan 2006, 15:11:28 UTC - in response to Message 9207. These are intelligent points. I'd like to pose the question another way: with current and projected staff levels, how many credits/day do you need us to do to make you productive enough for the project to be worthwhile; and how many credits/day would it take for us to get to the point where (at currently projected staff levels) you could not do any better with more crunch power? In November 2005 I saw a posting on another project that said that Rosetta was (then) starved for CPU cycles. Is this still true? River~~ ID: 9208 · Rating: 0 · rate: / Reply Quote

Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 9213 - Posted: 17 Jan 2006, 17:25:05 UTC There are papers available on the main site, and in the wiki at the bottom of the main page there is a link to the location; there are also all of the papers I have been able to track down related to the projects and the main BOINC application. For CPDN, there is growing information about the project and what it is trying to do ... I may be avoiding the point of the question, but we ARE trying to gather links to help you answer these questions. As far as capacity, I am sure that for some time to come the limiting factor is still going to be the number of participants and the available computing power. Worse, I can't buy more computers for a while :( With fewer computers it takes longer to make the data, and this extends the time to do the research, as many times the research cannot start until a lot of the data has been returned. I got this from the WCG forums, digging into the project information there ... Here at Rosetta@Home we are trying to find the best way to use the searchers. The more searchers, the more ways we can try in shorter periods of time. The computational need is only going to grow if we start to see the actual UW research applications (I expect them ANY DAY NOW - just to try to get them motivated to start :)) ... so, we will be doing trials of search strategies, working on the program and also doing "real" science AT THE SAME TIME ... I do write as a non-project member. Though I will lay claim to a small amount of experience in this arena ... with this, now all the staffer has to say is "what Paul said..." ID: 9213 · Rating: 1 · rate: / Reply Quote

David Baker Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0	Message 9255 - Posted: 18 Jan 2006, 7:40:46 UTC - in response to Message 9205. I'd just like to ask here again, the same question that was asked at F@H forum (original question): "I'm really pleased that thousands of people are donating computer time to the advancement of biological research. I had a couple questions about this project. What is the bottleneck (weakest link) with the science being done here? If more processing power is made available, will useful results be published more quickly? Or is the 'slowest' part of the research the researchers themselves? Ideally, we would want the limiting factor to be the researchers, so they're not waiting for results of our work. Basically, will more processing speed the science? And what's the upper limit on processing numbers that above which will not be useful (ie, at what level of processing power will the scientists be the limiting factor?). My second question is with regards to useful results. I measure useful results with citations - not the best measure, but certainly a measure. Is there a list of papers that have cited the work done by Rosetta and related distributed computing efforts? If volunteers would be willing to cite papers that cite these efforts, that would be great. Basically, I'm wondering if there's value to progressing research by getting more people on side." Excellent questions in this thread. thanks Paul for your on target responses. We are still limited by CPU time. I need to put together the report on the last several weeks of calculations for you, but basically what we are seeing is that for 4 of the 10 proteins we've been folding, the searches are working beautifully in that the lowest energy structures found are essentially identical to the true structure. 
For 4 of the remaining proteins, there is an unusual feature in the correct structure that is only sampled very rarely in the searches. Because there are far more structures generated without the feature, the lowest energy structures tend not to have the feature, and the predictions do not have the accuracy of the first set. But if we could sample more, so that we were generating large numbers of structures with the rare feature and thus exploring this part of the energy landscape more thoroughly, I think it is likely the lowest energy structures would be in this (correct) part of the landscape. That said, it is also likely that we could improve the searching procedure by making better use of the information generated by the searches already carried out, and so we are also limited by our imaginations, number of good ideas, etc. All the work units with the funny names you have been running are testing different ideas for how best to do the search, but as Paul predicted, no one method seems to be consistently better than all the others. So we are still searching for the magic bullet. It is possible, of course, that there is no magic bullet, and the only solution is to search more (carry out still larger numbers of independent runs). So this brings us back to needing more CPU power! As far as research papers go, we are still in the early days, but there is already a paper in press in Proceedings of the National Academy that describes the very first results we got back in September-October. The people who found the lowest energy structures are acknowledged individually! I have no doubt that many important papers will follow in the time to come. And lastly, how many credits per day do we need? The answer, for better or for worse, is MORE! The current runs are all on proteins less than 85 amino acids, and searching is a problem. 
The size of the space that needs to be searched grows exponentially with length (number of amino acids), and many proteins are over 200 amino acids long--we do not have a reliable estimate of how long searching these spaces would take, and I would hesitate to even start at this until we are at current SETI levels or above. Also, we have yet to start our HIV vaccine design and other disease related work on BOINC -- when these calculations kick in we will be even more cpu limited. ID: 9255 · Rating: 0 · rate: / Reply Quote
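Editorial sketch: the sampling argument in the post above can be made concrete with a little probability. This is a minimal illustration, not the project's method. It assumes every independent run has the same fixed chance `p` of sampling the rare feature, and that each residue contributes a fixed number of states to the search space. The values `0.001` and `3` are made up for illustration; they are not Rosetta figures.

```python
import math

def prob_at_least_one(p, n_runs):
    """Chance that at least one of n_runs independent runs samples the rare feature."""
    return 1.0 - (1.0 - p) ** n_runs

def runs_needed(p, confidence=0.95):
    """Smallest run count that reaches the requested chance of one or more hits."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))

def growth_factor(states_per_residue, short_len, long_len):
    """How many times larger the search space is for the longer chain."""
    return states_per_residue ** (long_len - short_len)

P_RARE = 0.001            # assumed per-run chance of hitting the rare feature
STATES_PER_RESIDUE = 3    # assumed number of states per amino acid

n = runs_needed(P_RARE)
print("runs for a 95% chance of one hit:", n)
print("chance after that many runs:", prob_at_least_one(P_RARE, n))
print("space growth, 85 -> 200 residues:",
      f"{growth_factor(STATES_PER_RESIDUE, 85, 200):.2e}")
```

Under these assumptions the number of runs scales roughly as 1/p, while the search space grows by a constant factor for every extra residue, which is why more CPU time helps the rare-feature cases but does not by itself tame the 200-residue proteins.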

scsimodo Send message Joined: 17 Sep 05 Posts: 93 Credit: 946,359 RAC: 0	Message 9256 - Posted: 18 Jan 2006, 8:49:37 UTC - in response to Message 9255. Last modified: 18 Jan 2006, 9:17:24 UTC And lastly, how many credits per day do we need? The answer, for better or for worse, is MORE! The current runs are all on proteins less than 85 amino acids, and searching is a problem. The size of the space that needs to be searched grows exponentially with length (number of amino acids), and many proteins are over 200 amino acids long--we do not have a reliable estimate of how long searching these spaces would take, and I would hesitate to even start at this until we are at current SETI levels or above. Also, we have yet to start our HIV vaccine design and other disease related work on BOINC -- when these calculations kick in we will be even more cpu limited. Would it be an option to compile a special Rosetta version with CPU optimizations (e.g. SSE2), put it somewhere on your HP and let people download and manually install it? Would speed up crunching a lot, I guess... ID: 9256 · Rating: 0 · rate: / Reply Quote

dcdc Send message Joined: 3 Nov 05 Posts: 1836 Credit: 124,981,563 RAC: 0	Message 9268 - Posted: 18 Jan 2006, 12:46:06 UTC - in response to Message 9256. Do the memory requirements increase exponentially with the protein size too? If so, I guess the limiting factor may become the CPU time available with enough memory available! I'm sure there's plenty of work to keep all the computers busy though :) Would it be an option to compile a special Rosetta version with CPU optimizations (e.g. SSE2), put it somewhere on your HP and let people download and manually install it? It would and it's on the to-do list I believe - there are quite a few threads dedicated to that and similar topics. ID: 9268 · Rating: 0 · rate: / Reply Quote

Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 9272 - Posted: 18 Jan 2006, 14:06:28 UTC Just to amplify a little more. We have been doing searches on KNOWN structures. So, we can compare trials with actuals. And, as indicated, we do well with some of the searches, worse in other cases. But, since this is a known territory we can see this. When doing new, unknown searches, things are a little more problematical. There we don't yet know what the "truth" is, so the only way we know we have had success is when we have the lowest minimum ... the problem is, as always, how do you know you are in the lowest and not just a local minimum? If I walk out my front door I go downhill about 4 feet to the street. That is low ... but if I walk a half mile away or so I find a creek that is even lower ... but even that is 43 feet above sea level ... knowing geography, I know I am not in Death Valley and so I am nowhere close to the lowest point in the US ... or the Mariana Trench ... We are searching in a space where Combinatorial Explosion kills you. (Wikipedia, or Combinatorial Explosion). I hope this helps ... ID: 9272 · Rating: 0 · rate: / Reply Quote
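Editorial sketch: the front-door, creek and Death Valley picture above is the classic local-versus-global minimum problem. The toy below is only an illustration on a made-up one-dimensional landscape (the function `energy` is hypothetical and has nothing to do with Rosetta's real energy function). It shows why many independent starts are used: each downhill walk gets stuck in whatever valley is nearest to its starting point.

```python
import math
import random

def energy(x):
    """Toy 1-D landscape with several valleys (purely illustrative)."""
    return 0.1 * x * x + math.sin(3 * x)

def descend(x, step=0.01, max_iter=10_000):
    """Greedy downhill walk: step to a neighbour for as long as that lowers the energy."""
    for _ in range(max_iter):
        best = min((x - step, x, x + step), key=energy)
        if best == x:          # no lower neighbour: we are in a (local) minimum
            break
        x = best
    return x

def best_of_restarts(n_starts, seed=0):
    """Run n_starts independent walks from random points and keep the lowest result."""
    rng = random.Random(seed)
    finals = [descend(rng.uniform(-10.0, 10.0)) for _ in range(n_starts)]
    return min(finals, key=energy)

for n_starts in (1, 5, 50):
    x = best_of_restarts(n_starts)
    print(n_starts, "starts -> x =", round(x, 2), "energy =", round(energy(x), 3))
```

More starts can only keep or improve the best energy found, but nothing in the output tells you whether the true lowest valley has been reached, which is exactly the difficulty described for unknown structures.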

River~~ Send message Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0	Message 9275 - Posted: 18 Jan 2006, 14:30:52 UTC - in response to Message 9255. And lastly, how many credits per day do we need? The answer, for better or for worse, is MORE! ... we do not have a reliable estimate of how long searching these spaces would take, and I would hesitate to even start at this until we are at current SETI levels or above. That gives us a helpful ballpark estimate. Yesterday BOINCstats recorded about 30 million cobblestones crunched by all BOINC projects, of which 19 million were SETI and 1.7 million Rosetta. So take "current SETI levels" as 20 Megastones/day. In round figures then, Rosetta could usefully be about 11x - 12x its current size before it even makes sense to ask the question. That is useful to know. Or to put it another way, Rosetta can usefully recruit 2/3rds of the current BOINC donors - except that that would not be very friendly to those projects, all of which will be dear to somebody's heart. To avoid poaching then, Rosetta on its own is looking to recruit a 67% increase in the total number of BOINCers. Not impossible, but certainly a challenge! And that only takes us to the point where we can expect a sensible answer to the 'how big' question, it is not the answer itself. R~~ ID: 9275 · Rating: 0 · rate: / Reply Quote
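Editorial sketch: the ballpark arithmetic in the post above, written out. The three daily totals are the BOINCstats figures quoted in the post, and the 20 million target is the poster's own rounding of "current SETI levels". The last figure (counting only the extra credit Rosetta would still need) is an added variant, not something the post states.

```python
# Daily credit in millions of cobblestones, as quoted in the post.
TOTAL_BOINC = 30.0
SETI = 19.0
ROSETTA = 1.7
TARGET = 20.0   # the post's rounding of "current SETI levels"

# How many times larger Rosetta would need to be.
growth_factor = TARGET / ROSETTA

# The target expressed as a share of everything BOINC does today
# (the post's "2/3rds" and "67%").
share_of_boinc = TARGET / TOTAL_BOINC

# Variant: only the additional credit needed, as a share of today's BOINC total.
extra_share = (TARGET - ROSETTA) / TOTAL_BOINC

print(f"growth factor: {growth_factor:.1f}x")
print(f"target as share of BOINC: {share_of_boinc:.0%}")
print(f"extra credit as share of BOINC: {extra_share:.0%}")
```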

Dimitris Hatzopoulos Send message Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0	Message 9280 - Posted: 18 Jan 2006, 15:46:14 UTC Based on this feedback, I'd think it's important to offer optimised Rosetta clients (SSE/SSE2/SSE3 etc) ASAP. Btw, wasn't it an important BOINC point that the projects can now know PC specs, so they can make the best out of it e.g. send optimized exe's to newer CPUs or bigger WUs to PCs with bigger RAM/CPU etc, without involving the user? Also, I wonder what's the level of redundancy/duplication of work? Is every WU computed at least twice? I seem to remember Folding@Home reporting that their optimized code (GROMACS) cores run 2x - 3x faster. Also, all their cores will check for working SSE and will use it. Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity ID: 9280 · Rating: 0 · rate: / Reply Quote

Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 9284 - Posted: 18 Jan 2006, 16:49:06 UTC Last modified: 18 Jan 2006, 16:50:11 UTC Rosetta@Home is not using redundant computing for the very reason that it would cut down the work done. The only gain is improved "fairness" of the credit grants. Even though there are those that are "inflating" their scores, for me, the improvement in the posture of the credit system is offset by the cost. So, let them have their day in the sun ... :) Later, when "real" science is being done, we will likely be seeing the redundancy on THAT work at 3 or more. For the search search, we need to do as many searches as possible to see if we can figure out how to best search. Personally, I had been hoping that there would be a data crunch at SETI@Home to, ahem, force some of the fanatics to maybe consider that there are other things in life than running up a SETI@Home score. That would "free" up resources for other projects and we would ALL benefit. Heck, even Dr. Anderson does not want people to put all their effort into SETI@Home because of the possibilities of server outages. I know my "take" on this is biased, and in some circles is *highly* unpopular, but it is funny to see supposedly grown-up people whining that they only want to do SETI@Home and they cannot get work ... ignoring the fact that they have made the choice to run the risk of not having work ... sigh ... read the EULA ... Oh well ... I am guilty of "starving" Rosetta@Home at the moment as I am still trying to get Einstein@Home over SETI@Home and now with LHC@Home with work, well, um, shucks ... :) ==== edit You can tell the redundancy level on any project by looking at the Work Unit page and looking at the min quorum size number ... more details in the Wiki of course .... ID: 9284 · Rating: 0 · rate: / Reply Quote

Dimitris Hatzopoulos Send message Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0	Message 9287 - Posted: 18 Jan 2006, 17:29:43 UTC - in response to Message 9284. Personally, I had been hoping that there would be a data crunch at SETI@Home to, ahem, force some of the fanatics to maybe consider that there are other things in life than running up a SETI@Home score. That would "free" up resources for other projects and we would ALL benefit. Agreed. But in my experience you won't achieve much trying to change the minds of the hardcore "SETI or nothing" fanatics. It's better to try to expand the "pie", i.e. bring in new BOINC users, rather than compete for "market share". I think the most promising community would be Linux users, because most Unix/Linux computers run 24/7 anyway and the users themselves tend to be more tech-savvy (or willing to "experiment"). Also, running BOINC under Linux in a separate account is 100% secure. On the downside, the Unix boinc_cmd is very weak vs the Win app, even options like "suspend" etc won't work. And you won't get the cool graphics of desktop PCs. Personally, I've been doing some preliminary work on "BOINC advocacy", including the document in my sig. The problem is that BOINC setup/config is not as easy as it should be, and if you add the project-specific bugs (Rosetta 1%, SIMAP's sigsegv faults etc) it's easy for the non-determined contributor to let it be (or go to F@H, which is much easier to install/run). Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity ID: 9287 · Rating: 0 · rate: / Reply Quote

Dimitris Hatzopoulos Send message Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0	Message 9292 - Posted: 18 Jan 2006, 18:24:44 UTC Talking about getting CPUs from other projects, look at Predictor (which is the closest one to Rosetta afaik). I feel that Predictor@Home's negligence of their BOINC project URL is very irresponsible IMO. Considering that it's re-directed to a "domain-parking" page with ads for over a day, it's not unthinkable that someone might grab it, setup boinc and distribute some harmful trojan to the 50k PCs which have joined P@H... Can't imagine the consequences on the other BOINC projects. Personally I've suspended the project on all my computers. Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity ID: 9292 · Rating: 0 · rate: / Reply Quote

River~~ Send message Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0	Message 9302 - Posted: 18 Jan 2006, 19:39:17 UTC - in response to Message 9292. Last modified: 18 Jan 2006, 19:39:52 UTC ... I feel that Predictor@Home's negligence of their BOINC project URL is very irresponsible IMO. Considering that it's re-directed to a "domain-parking" page with ads for over a day, it's not unthinkable that someone might grab it, setup boinc and distribute some harmful trojan to the 50k PCs which have joined P@H... er, no. I agree it is clumsy to let a url renewal go unpaid (tho even microsoft have made this mistake at least twice, in 1999 and in 2003) At the same time, you can't just go and sign up a .edu url like you can a .com one. The odds of some other organisation entitled to register a .edu doing a trojan exploit is much less than if it was just some script kiddie buying the domain with a stolen credit card. R~~ ID: 9302 · Rating: 0 · rate: / Reply Quote

Dimitris Hatzopoulos Send message Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0	Message 9304 - Posted: 18 Jan 2006, 19:54:02 UTC Last modified: 18 Jan 2006, 20:00:24 UTC The main domain (scripps.edu) is fine, it's the predictor.scripps.edu URL which gets redirected! And as you can see, if you try several times in a row, it gets redirected to different "domain-park" and "advertisement" sites. EDIT, here is the HTTP protocol dialogue, someone is re-directing their traffic via HTTP 302:
    telnet predictor.scripps.edu 80
    Trying 83.138.141.238...
    Connected to predictor.scripps.edu.
    Escape character is '^]'.
    GET / HTTP/1.1
    Host: predictor.scripps.edu:80
    HTTP/1.1 302 Found
    Date: Wed, 18 Jan 2006 19:56:01 GMT
    Server: Apache/2.0.54 (Unix) PHP/4.4.0
    Accept-Ranges: bytes
    X-Powered-By: PHP/4.4.0
    Set-Cookie: visitor=yes; expires=Wed, 18 Jan 2006 20:56:15 GMT
    Location: http://sedoparking.com/parking.php4?domain=predictor.scripps.edu:80
    Content-Length: 0
    Content-Type: text/html; charset=ISO-8859-1
    Connection closed by foreign host.
Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity ID: 9304 · Rating: 0 · rate: / Reply Quote
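Editorial sketch: the captured dialogue above can also be checked programmatically. The snippet below does not touch the network; it simply parses a copy of the response head quoted in the post and tests whether the `Location` header points at a different host than the one requested. The helper names `parse_response_head` and `redirects_off_site` are hypothetical, chosen for this illustration.

```python
from urllib.parse import urlsplit

# Response head copied from the telnet session in the post (a few headers omitted).
RAW_RESPONSE = (
    "HTTP/1.1 302 Found\r\n"
    "Date: Wed, 18 Jan 2006 19:56:01 GMT\r\n"
    "Server: Apache/2.0.54 (Unix) PHP/4.4.0\r\n"
    "Location: http://sedoparking.com/parking.php4?domain=predictor.scripps.edu:80\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

def parse_response_head(raw):
    """Return (status_code, headers) from the head of a raw HTTP/1.x response."""
    head = raw.split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")   # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return int(code), headers

def redirects_off_site(raw, requested_host):
    """True if the response is a 3xx whose Location names a different host."""
    code, headers = parse_response_head(raw)
    if not 300 <= code < 400 or "location" not in headers:
        return False
    return urlsplit(headers["location"]).hostname != requested_host

status, headers = parse_response_head(RAW_RESPONSE)
print(status, headers["location"])
print("off-site redirect:", redirects_off_site(RAW_RESPONSE, "predictor.scripps.edu"))
```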

Dimitris Hatzopoulos Send message Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0	Message 9307 - Posted: 18 Jan 2006, 21:23:57 UTC - in response to Message 9304. Please ignore the prior message, it seems it is a DNS spoofing/poisoning issue somewhere down the line (normally the IP for predictor.scripps.edu should be 137.131.252.96, not the 83.138.141.238 shown below). The main domain (scripps.edu) is fine, it's the predictor.scripps.edu URL which gets redirected! And as you can see, if you try several times in a row, it gets redirected to different "domain-park" and "advertisement" sites. EDIT, here is the HTTP protocol dialogue, someone is re-directing their traffic via HTTP 302: telnet predictor.scripps.edu 80 Trying 83.138.141.238... Connected to predictor.scripps.edu. Escape character is '^]'. Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity ID: 9307 · Rating: 0 · rate: / Reply Quote
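Editorial sketch: the check described above (does the name resolve to the address you expect?) takes only a few lines. The two IP addresses are the ones given in the post; the function name `check_resolution` and its injectable `resolver` parameter are assumptions made for this illustration so the logic can be exercised without a live lookup.

```python
import socket

EXPECTED_IP = "137.131.252.96"   # the "normal" address, according to the post
SUSPECT_IP = "83.138.141.238"    # the address seen in the telnet session

def check_resolution(hostname, expected_ip, resolver=socket.gethostbyname):
    """Resolve hostname and report whether it matches the expected address.

    resolver defaults to the system resolver; pass a stand-in to test offline.
    """
    resolved = resolver(hostname)
    return resolved == expected_ip, resolved

# Offline demonstration with a stand-in resolver that returns the suspect address.
ok, seen = check_resolution("predictor.scripps.edu", EXPECTED_IP,
                            resolver=lambda name: SUSPECT_IP)
print("matches expected:", ok, "resolved to:", seen)
```

Calling `check_resolution` without the `resolver` argument performs a real DNS lookup. The addresses date from 2006 and may no longer be meaningful, so treat them as historical examples only.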

Cureseekers~Kristof Send message Joined: 5 Nov 05 Posts: 80 Credit: 689,603 RAC: 0	Message 9323 - Posted: 19 Jan 2006, 6:55:19 UTC Last modified: 19 Jan 2006, 7:04:18 UTC To David Baker: In your last message here, you said that CPU power was the limit. So your researchers are waiting for us to crunch your workunits.... Why do we regularly get the message 'No work from project'? This message causes some frustration among our members. Sometimes their client is just idling because no work units were sent. Can't this issue be solved, or is it a hardware issue? (I know you can buffer more work if you change it in your preferences. But this isn't the default, so you have to change it yourself. Not everyone knows that. Maybe it would be an idea to set the default to 1 day instead of 0.1 day?) Member of Dutch Power Cows ID: 9323 · Rating: 0 · rate: / Reply Quote

Deamiter Send message Joined: 9 Nov 05 Posts: 26 Credit: 3,793,650 RAC: 0	Message 9326 - Posted: 19 Jan 2006, 7:25:43 UTC - in response to Message 9284. Rosetta@Home is not using redundant computing for the very reason that it would cut down the work done. The only gain is improved "fairness" of the credit grants. Even though there are those that are "inflating" their scores, for me, the improvement in the posture of the credit system is offset by the cost. So, let them have their day in the sun ... :) Later, when "real" science is being done, we will likely be seeing the redundancy on THAT work at 3 or more. For the search search, we need to do as many searches as possible to see if we can figure out how to best search. Why would they ever increase the quorum when doing so would clearly interfere with the point of your next sentence -- to run as many searches as possible? Yeah, there are credit issues, but for a project like this there is NO advantage to running the same WU twice! Push for flop counting, not redundant computing! With other projects, accuracy of each data point is much more important to the final results. With something like this that's driven by probability, the chances of a bad result being the best are so small as to be inconsequential! CERTAINLY not high enough to warrant cutting the computing power by two or even three times! I love my credits, and I can't seem to go a day without checking my stats, but I'd gladly give it all up if I knew it'd push usable results through faster! Like you (Paul) I'm pushing LHC@home in the near future, but I've got two dual-core 3.2GHz Pentiums and a dual-processor 3.4GHz workstation sitting in storage until the lab for my research is completed. I can't say that it'll double the throughput of Rosetta@home, but I plan to push Rosetta hard. I would sure hate to think that my efficiency was being cut by 3x so we can keep users with optimized apps from getting credit a bit faster! 
ID: 9326 · Rating: 0 · rate: / Reply Quote
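Editorial sketch: the throughput cost of redundancy argued about above is simple division. If every work unit must be returned by `quorum` different hosts before it counts, the same pool of machines produces only 1/quorum as many distinct results. The daily figure of 30,000 returned results is a made-up number for illustration, not a Rosetta@home statistic.

```python
def distinct_results_per_day(results_returned_per_day, quorum):
    """Distinct work units completed per day when each one is computed quorum times."""
    if quorum < 1:
        raise ValueError("quorum must be at least 1")
    return results_returned_per_day // quorum

RETURNED_PER_DAY = 30_000   # hypothetical total results returned by all hosts

for quorum in (1, 2, 3):
    distinct = distinct_results_per_day(RETURNED_PER_DAY, quorum)
    print(f"quorum {quorum}: {distinct} distinct searches per day")
```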

Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 9354 - Posted: 19 Jan 2006, 15:59:38 UTC - in response to Message 9323. To David Baker: In your last message here, you said the CPU-power was the limit. So your researchers are waiting for us to crunch your workunits.... Why do we get regularly the message 'No work from project'? As it turns out, there are two major bugs in the feeder. Together they can limit the effectiveness of the feeder. I do NOT have any internal data so I cannot tell you how bad the problem is for each and every project. However, the symptom for these two bugs will be "No work" and Rosetta@Home is not the only project to see this issue. It has been a LONG standing problem at Predictor@Home, whose need for homogeneous redundancy makes the issue worse. The bottom line is that if the feeder is 100 slots in length, it is POSSIBLE that nearly half of them will be unusable for one reason or another. That is the bad news. The worse news is I proposed an initial fix a week ago; worse, I wrote a new feeder with a different approach to "collisions" last night and sent it off to UCB ... this is worse because I am not a C programmer ... I don't have the ability to compile the system (yes I have Apple's X-code tools, but do not have them configured to do BOINC and have not been healthy enough to tackle THAT project, again, yet), and no way to test my changes. So, you can see my initial analysis and code suggestions on the developer's mailing list, a couple new questions/answers by Jeff Cobb of UCB and SOME of my answers. I believe that the concept that I propose will solve the two issues, the skipping of empty slots (my first patch) and the "orphan" result problem ... IF the suggested feeder, or something close to my suggestion, makes it live, I will probably write up the details of the concept in the Wiki to capture the design INTENT, something that I feel is lacking in the developer side of the documentation... 
but as I have repeatedly said, that is another issue ... :) So, if UCB proves out the feeder, it SHOULD help alleviate the "No work" problem on all projects ... ID: 9354 · Rating: 0 · rate: / Reply Quote

SwZ Send message Joined: 1 Jan 06 Posts: 37 Credit: 169,775 RAC: 0	Message 9365 - Posted: 19 Jan 2006, 16:53:57 UTC - in response to Message 9280. Also, I wonder what's the level of redundancy/duplication of work? Is every WU computed at least twice? Zero redundancy is acceptable here. Since each trajectory contributes only a single point to a large statistical ensemble, a few wrong data points cannot distort the whole picture. We are searching for a few good points, so it is important to maximise the number of different trajectories; it is more practical not to duplicate work for verification, but to explore more trajectories instead. The best candidates are then verified separately. ID: 9365 · Rating: 0 · rate: / Reply Quote
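Editorial sketch: the "explore widely, then verify only the best candidates" idea from the post above, in miniature. Everything here is a stand-in: `toy_energy` is a hypothetical deterministic function playing the role of a trajectory, and the corrupted entry simulates a host returning a bogus low energy. It is not how the project actually validates results.

```python
def toy_energy(seed):
    """Stand-in for running one trajectory; deterministic in its seed (hypothetical)."""
    return ((seed * 37) % 101) / 10.0

def lowest_energy(results, k):
    """Pick the k results with the lowest reported energy."""
    return sorted(results, key=lambda r: r["energy"])[:k]

def verify(candidates, recompute, tol=1e-9):
    """Re-run only the candidates and keep those whose energy is reproduced."""
    return [c for c in candidates
            if abs(recompute(c["seed"]) - c["energy"]) <= tol]

# Fifty honest results, each computed once, plus one corrupted report.
results = [{"seed": s, "energy": toy_energy(s)} for s in range(1, 51)]
results.append({"seed": 999, "energy": -5.0})   # bogus value from a faulty host

top = lowest_energy(results, 5)
confirmed = verify(top, toy_energy)

print("recomputations needed:", len(top), "instead of", len(results))
print("confirmed seeds:", [c["seed"] for c in confirmed])
```

The bad result does reach the shortlist because it looks attractive, but it is rejected at the verification step, and only five trajectories had to be recomputed rather than all fifty-one.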