Problems and Technical Issues with Rosetta@home

Author	Message
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0	Message 98478 - Posted: 11 Aug 2020, 0:29:12 UTC - in response to Message 98477. Last modified: 11 Aug 2020, 1:07:29 UTC
> Folding is planning to offer a BOINC version. Probably not until after they finish their COVID-19 work, though.
I think "planning" is a bit ahead of where they are at the moment. I have been in those discussions for some time. They will probably look into it at some point. It could be done, but it takes some work.
PS - By the way, they are beta testing a new CPU core a8. It is said to offer a 40 to 50% increase in output, and allow more advanced science. It is based on the latest GROMACS 2020. I don't know much beyond that, but want to check it out for a while. (I don't do the betas, but you can look on their forums at the beta section if you have a login). If you set the "advanced" flag in their Control app, you can get the first ones right after the beta finishes.

Corgi Joined: 19 Jun 19 Posts: 6 Credit: 3,058,786 RAC: 0	Message 98479 - Posted: 11 Aug 2020, 4:52:33 UTC - in response to Message 98467.
Quite a wonderful bunch of replies! I'm having a brain hiccup, though:
> How many cores do you have?
...remind me where I look to answer this, please?
Computer's always on; I remember to take BOINC off pause about half the time before I go to bed [/sheepish]. I might also try that GPU-slot idea mentioned as well.

Brian Nixon Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0	Message 98481 - Posted: 11 Aug 2020, 8:22:38 UTC - in response to Message 98479.
> How many cores do you have? ...remind me where I look to answer this, please?
Computer details: Number of processors: 4
When running multiple BOINC projects, you might need to review the "Resource share" setting for each one, and the associated client setting "Switch between tasks", to make sure Rosetta tasks have sufficient chance to run. (I only run Rosetta, so others can advise better here.)
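
For a cross-check from the machine itself rather than the website, here is a minimal sketch (not from the thread) using only Python's standard library. It assumes that the "Number of processors" figure on the Computer details page corresponds to the logical processor count the operating system reports; the function name is made up for illustration.

```python
import os


def logical_processors() -> int:
    """Return the number of logical processors the OS reports (at least 1)."""
    # os.cpu_count() can return None when the count cannot be determined,
    # so fall back to 1 in that case.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(f"Number of processors: {logical_processors()}")
```

On a machine with hyper-threading or SMT this is typically twice the number of physical cores.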

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 7	Message 98485 - Posted: 11 Aug 2020, 17:13:04 UTC - in response to Message 98479.
> Computer's always on; I remember to take BOINC off pause about half the time before I go to bed [/sheepish].
Do what I do: tell Boinc which of the programs you run need it to pause, then it does it itself and always remembers to turn back on.
If you're using Boinc Manager, it's in "Options", "Exclusive Applications". Then just add the path to the program in either the top box for stopping Boinc altogether (ignore what it says, it's wrong, it stops your GPU as well), or the bottom box to only stop the GPU and leave the CPU running.
If you're using Boinctasks, it's in "Extra", "Boinc Preference", "Exclusive Applications", making sure you have the correct computer selected. Then add the path to the program and tick "GPU only" if applicable, otherwise it pauses everything.

10esseetony Joined: 24 Dec 11 Posts: 5 Credit: 24,116,987 RAC: 1,142	Message 98791 - Posted: 7 Sep 2020, 19:26:06 UTC Last modified: 7 Sep 2020, 19:26:38 UTC
Problems and Technical Issues, eh? How about 41GB of RAM for ONE task?
Name: ygG5REMC******1009391_1307_0

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 7	Message 98793 - Posted: 7 Sep 2020, 19:35:23 UTC - in response to Message 98791. Last modified: 7 Sep 2020, 19:36:17 UTC
> Problems and Technical Issues, eh? How about 41GB of RAM for ONE task? Name: ygG5REMC******1009391_1307_0
Time to upgrade your computer ;-) Two of mine have 36GB RAM, but they'll take 128GB :-)
By the way, your image requires an Anandtech login.

10esseetony Joined: 24 Dec 11 Posts: 5 Credit: 24,116,987 RAC: 1,142	Message 98811 - Posted: 7 Sep 2020, 21:39:46 UTC - in response to Message 98793. Last modified: 7 Sep 2020, 21:41:40 UTC
Thanks for letting me know the image can't be accessed, but clearly you are BOINC'ing around with the wrong team, if you don't have an AnandTech forum account. :P The pic is just a screenshot of the properties of the offending task.
As for the RAM upgrade, I am already maxed out at 512GB. I am not quite ready to buy 8 sticks of 128GB just yet....not sure the MB can support it anyway.
(I am teasing....X570 MB with 64GB of RAM, Ryzen 3950X. So you can imagine my surprise at seeing tasks "waiting for memory," especially since I am only letting 8 Rosetta tasks run for the moment.)

gbaker Joined: 10 Jul 09 Posts: 1 Credit: 65,119,218 RAC: 774	Message 98812 - Posted: 7 Sep 2020, 21:47:49 UTC - in response to Message 98791.
I'm having a similar issue. It seems like one or two tasks keep using as much memory as they can. A task typically starts with normal behavior, then suddenly its memory usage starts climbing fast, at a linear rate of about 10GB per minute. If no memory limit is set for boinc, memory usage increases until it hits the 32GB limit in my system (and quickly burns through swap). Then memory usage crashes back down to normal (about 12GB). If I set a memory limit, it would generally just suspend the task as "waiting for memory" (which leaves a core and a chunk of memory unavailable to other boinc tasks).
Sometimes it will behave normally for a few minutes, and sometimes it immediately starts another cycle of fast increase in usage, then immediately falls back to normal once it fills memory.
In my case, the problem workunits that have popped up while writing this are some variation on q1RftdTf_fold_and_dock. (This wasn't the only problem workunit, just the one that happened to be causing problems at this moment.)
Here is a link to one of the tasks that seemed to be causing the problem (which I aborted): https://boinc.bakerlab.org/rosetta/result.php?resultid=1255773464
I'm currently running Rosetta on two computers, but only one of them seems to be running into this issue (at least at the current moment): https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=5287905
The other computer seems to be running a non-overlapping set of tasks, so I don't know whether or not it would experience the problem given the same tasks.
This isn't really a major issue for me, but I'm assuming it's not intended behavior.
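
To put the figures in that report in perspective, a rough back-of-the-envelope sketch (not from the thread): assuming the growth really is linear and that the roughly 12GB "normal" figure is the starting point on the 32GB machine, the time from the start of a runaway cycle to exhausting RAM is only a couple of minutes. The function name and parameters are hypothetical.

```python
def minutes_until_full(total_gb: float, baseline_gb: float,
                       growth_gb_per_min: float) -> float:
    """Minutes until memory use reaches total_gb, growing linearly from baseline_gb."""
    if growth_gb_per_min <= 0:
        raise ValueError("growth rate must be positive")
    # Headroom is whatever RAM is left above the normal working level.
    headroom_gb = max(total_gb - baseline_gb, 0.0)
    return headroom_gb / growth_gb_per_min


if __name__ == "__main__":
    # Figures quoted in the post: 32 GB of RAM, about 12 GB in normal use,
    # growth of about 10 GB per minute (all approximate).
    print(minutes_until_full(32, 12, 10))  # prints 2.0
```

That short window is consistent with the task cycling repeatedly while a post is being written.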

Grant (SSSF) Joined: 28 Mar 20 Posts: 1925 Credit: 18,534,891 RAC: 0	Message 98817 - Posted: 7 Sep 2020, 22:23:39 UTC - in response to Message 98791.
> Problems and Technical Issues, eh? How about 41GB of RAM for ONE task? Name: ygG5REMC******1009391_1307_0
Looks like another batch of dodgy Work Units.
BTW- I'd suggest reducing the size of your cache, to 0. You only need to cache Tasks if there's a chance of running out of work before you next get new work. Being signed up to a dozen active projects, that will never occur, so no need for a cache. So no more missed deadlines- no point processing work if you're not going to get Credit for it.
Your account, computing preferences, Other:
Store at least 0.01 days of work
Store up to an additional 0.01 days of work
Grant Darwin NT
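
For a sense of scale, a small sketch (not from the thread) converting the suggested 0.01-day settings into minutes. The helper name is made up; the arithmetic is just days times 24 times 60.

```python
def days_to_minutes(days: float) -> float:
    """Convert a work-buffer setting expressed in days to minutes."""
    return days * 24 * 60


if __name__ == "__main__":
    store_at_least = 0.01    # "Store at least ... days of work"
    store_additional = 0.01  # "Store up to an additional ... days of work"
    print(f"Minimum buffer: {days_to_minutes(store_at_least):.1f} minutes")
    print(f"With additional: "
          f"{days_to_minutes(store_at_least + store_additional):.1f} minutes")
```

So the suggested values amount to roughly a quarter of an hour of queued work, or about half an hour with the additional buffer included.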

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 7	Message 98820 - Posted: 7 Sep 2020, 22:40:02 UTC - in response to Message 98811.
> Thanks for letting me know the image can't be accessed, but clearly you are BOINC'ing around with the wrong team, if you don't have an AnandTech forum account. :P The pic is just a screenshot of the properties of the offending task. As for the RAM upgrade, I am already maxed out at 512GB. I am not quite ready to buy 8 sticks of 128GB just yet....not sure the MB can support it anyway. ( I am teasing....X570 MB with 64GB of RAM, Ryzen 3950X. So you can imagine my surprise at seeing tasks "waiting for memory," especially since I am only letting 8 Rosetta tasks run for the moment )
Even if I had an account, I doubt I'd be asked to log in for an inline image. Or does it stay logged in forever on your browser?

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 7	Message 98822 - Posted: 7 Sep 2020, 22:41:49 UTC - in response to Message 98817.
> > Problems and Technical Issues, eh? How about 41GB of RAM for ONE task? Name: ygG5REMC******1009391_1307_0
> Looks like another batch of dodgy Work Units. BTW- I'd suggest reducing the size of your cache, to 0. You only need to cache Tasks if there's a chance of running out of work before you next get new work. Being signed up to a dozen active projects, that will never occur, so no need for a cache. So no more missed deadlines- no point processing work if you're not going to get Credit for it. Your account, computing preferences, Other: Store at least 0.01 days of work. Store up to an additional 0.01 days of work.
Unless you run Milkyway on a GPU. Those have tasks that can take 30 seconds. And they refuse to fix the server (I've asked two successive project leaders and nothing gets fixed) - you cannot download new tasks if you're reporting completed tasks, so you need a big buffer (well 3 hours anyway).

robertmiles Joined: 16 Jun 08 Posts: 1264 Credit: 14,421,737 RAC: 0	Message 98827 - Posted: 7 Sep 2020, 22:51:35 UTC - in response to Message 98820.
> Even if I had an account, I doubt I'd be asked to log in for an inline image. Or does it stay logged in forever on your browser?
My Firefox browser will save logon information indefinitely as long as you use it every so often.

Grant (SSSF) Joined: 28 Mar 20 Posts: 1925 Credit: 18,534,891 RAC: 0	Message 98829 - Posted: 7 Sep 2020, 22:58:30 UTC - in response to Message 98822.
> Unless you run Milkyway on a GPU. Those have tasks that can take 30 seconds. And they refuse to fix the server (I've asked two successive project leaders and nothing gets fixed) - you cannot download new tasks if you're reporting completed tasks, so you need a big buffer (well 3 hours anyway).
If it were your only project, yes. If you're running more than one project, it's still not necessary even if one of the projects has issues with work allocation. Your other project will pick up work, and then BOINC will do extra for the first project when it can get work to balance out the debt between projects.
Grant Darwin NT

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 7	Message 98835 - Posted: 7 Sep 2020, 23:49:30 UTC - in response to Message 98827.
> > Even if I had an account, I doubt I'd be asked to log in for an inline image. Or does it stay logged in forever on your browser?
> My Firefox browser will save logon information indefinitely as long as you use it every so often.
Yes, my Opera browser does that too, but all it does is fill in the password when you're asked for it. I don't think an inline image would pass the correct request through.

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 7	Message 98836 - Posted: 7 Sep 2020, 23:52:13 UTC - in response to Message 98829. Last modified: 7 Sep 2020, 23:52:40 UTC
> > Unless you run Milkyway on a GPU. Those have tasks that can take 30 seconds. And they refuse to fix the server (I've asked two successive project leaders and nothing gets fixed) - you cannot download new tasks if you're reporting completed tasks, so you need a big buffer (well 3 hours anyway).
> If it were your only project, yes. If you're running more than one project, it's still not necessary even if one of the projects has issues with work allocation. Your other project will pick up work, and then BOINC will do extra for the first project when it can get work to balance out the debt between projects.
I run more than Milkyway and I need the buffer. Otherwise Boinc only ever asks MW for a couple of 30 second tasks, as that's all it needs to fill the buffer. Then it hits the problem of not getting any more until it's backed off for 10 minutes. So even if I've said half Einstein, half MW, it ends up only managing to run MW a tenth of the time.
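
The "tenth of the time" figure can be reproduced with a deliberately simplified model (not from the thread, and not a description of how BOINC's scheduler actually works): assume each fetch delivers a fixed number of tasks and is followed by one backoff window during which no new Milkyway work arrives. The function and parameter names are hypothetical.

```python
def busy_fraction(tasks_per_fetch: int, task_seconds: float,
                  backoff_seconds: float) -> float:
    """Fraction of each backoff window spent running the fetched tasks.

    Simplified assumption: one fetch per backoff window, and the fetched
    tasks are the only work available for that project in the window.
    """
    if backoff_seconds <= 0:
        raise ValueError("backoff must be positive")
    work_seconds = tasks_per_fetch * task_seconds
    # The project cannot be busy for more than the whole window.
    return min(1.0, work_seconds / backoff_seconds)


if __name__ == "__main__":
    # "A couple of 30 second tasks" against a 10 minute backoff.
    print(busy_fraction(2, 30, 600))    # prints 0.1
    # A 3 hour buffer of 30 second tasks (360 tasks) covers the window.
    print(busy_fraction(360, 30, 600))  # prints 1.0
```

Under these assumptions two 30 second tasks fill only 60 of every 600 seconds, which is why a larger buffer keeps the GPU supplied through the backoff.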

10esseetony Joined: 24 Dec 11 Posts: 5 Credit: 24,116,987 RAC: 1,142	Message 98838 - Posted: 8 Sep 2020, 0:11:06 UTC - in response to Message 98820. Last modified: 8 Sep 2020, 0:18:59 UTC
You are correct, no, one shouldn't have to log in to see the image, now that you mention it. I'll just link to the thread, but beware, have your adblocker turned on: https://forums.anandtech.com/threads/recent-changes-in-projects.2500471/post-40275238
My tasks that timed out were not due to an inability to complete them; I simply forgot that I had 'temporarily' suspended Rosetta on that machine. ///insert forehead slap emoji here///
I would caution against having zero cache as you suggest....I pay too much for my energy bill to have my machines idle for ANY length of time (internet outage/server outage/server upgrade/home router locked up/etc etc). Rosetta has run dry many times and I do not check my machines but once daily.

Grant (SSSF) Joined: 28 Mar 20 Posts: 1925 Credit: 18,534,891 RAC: 0	Message 98839 - Posted: 8 Sep 2020, 0:23:21 UTC - in response to Message 98836.
> Otherwise Boinc only ever asks MW for a couple of 30 second tasks, as that's all it needs to fill the buffer. Then it hits the problem of not getting any more until it's backed off for 10 minutes. So even if I've said half Einstein, half MW, it ends up only managing to run MW a tenth of the time.
Looks like it's been an issue forever. J Stateson built a BOINC client to work around Milkyway's stuffed up server configuration. Finally getting new tasks only seconds after running out. May not be worth the hassle.
Grant Darwin NT

mikey Joined: 5 Jan 06 Posts: 1898 Credit: 12,860,414 RAC: 25	Message 98840 - Posted: 8 Sep 2020, 0:29:32 UTC - in response to Message 98836.
Peter Hucker wrote:
> I run more than Milkyway and I need the buffer. Otherwise Boinc only ever asks MW for a couple of 30 second tasks, as that's all it needs to fill the buffer. Then it hits the problem of not getting any more until it's backed off for 10 minutes. So even if I've said half Einstein, half MW, it ends up only managing to run MW a tenth of the time.
MilkyWay needs us to run other projects' tasks that run for more than 10 minutes, because that's the backoff the project requires: NO communication with MW for 10 minutes before it will send new gpu tasks. Personally I use PrimeGrid, as they have short tasks and respect the zero resource share. I run 1, maybe 2 PG tasks and then MW refills the cache and I am off and crunching them again. If the gpu is not the fastest then Collatz will work as a zero resource share project too.
IF you want to go outside the norm, a user made an alternative Boinc Manager at MilkyWay that handles the 10 minute backoff so that it's not a problem. I don't know how, but people that use it say it works.

Grant (SSSF) Joined: 28 Mar 20 Posts: 1925 Credit: 18,534,891 RAC: 0	Message 98841 - Posted: 8 Sep 2020, 0:31:11 UTC - in response to Message 98838.
> I pay too much for my energy bill to have my machines idle for ANY length of time
? If they are idle, the power they'd be using (unless they're really, really old systems) would be bugger all.
> Rosetta has run dry many times and I do not check my machines but once daily.
Rosetta might have run out, but you are also doing work for over a dozen other projects. I can't see all those projects running out of work at the same time- so you'll do a bit more work for those projects, then a bit extra for Rosetta when it has work again. Hence no need for a cache, let alone one of more than a few hours or so.
If you have crappy internet, what's the longest usual outage? Set the cache for that. Even so, with the short deadlines with Rosetta, anything larger than a couple of days when running that many projects will result in some missed deadlines as the systems work out how to meet their Resource share settings.
Grant Darwin NT

hangint3n Joined: 23 Mar 20 Posts: 8 Credit: 1,958,078 RAC: 0	Message 98845 - Posted: 8 Sep 2020, 0:55:15 UTC - in response to Message 98812.
Just had a similar problem on my box. It froze the whole thing up.
=== hangint3n