Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Now it's getting as complicated as what I am doing. What is difficult about it? Initially I left things alone and then credits got all out of whack; then I ran into issues with tasks taking days on end, nothing else getting done by other projects, and just a bunch of yo-yo stuff going on. So I took back control. Again, everything was working fine until this bug showed up, and that only just showed up, maybe after updating to the latest BOINC. Anyway, I'll mess around with things until I find the right mix. No need to clog up this thread. |
Bryn Mawr Joined: 26 Dec 18 Posts: 389 Credit: 12,070,320 RAC: 8,465 |
As Grant has said, the more you mess around with things the worse the situation will become. Set rec_half_life to 1, sit back and chill for a month and the system will follow your project shares. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Just letting it ride now, no app_config. See where things go. I'll have to go back and look at your notes on the half life, I haven't done that yet. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Bryn Mawr - added the half_life, will sit back and see what happens. Currently WCG is dying in credits; guess I will have to pump that one up higher in %. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1673 Credit: 17,593,339 RAC: 22,296 |
Bryn Mawr - added the half_life, will sit back and see what happens. Or just let things be until they have a chance to settle down- with 8 active projects, even with the changed half life value, I'd expect you're looking at a couple of weeks. One week bare minimum. Then adjust Resource share as necessary. Grant Darwin NT |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Bryn Mawr - added the half_life, will sit back and see what happens. Or just let things be until they have a chance to settle down- with 8 active projects, even with the changed half life value, I'd expect you're looking at a couple of weeks. One week bare minimum. OK, will do. It's 5 active. I thought I had 2 GPU projects, but it seems just one at the moment. So it's 3-4 CPU projects. |
Bryn Mawr Joined: 26 Dec 18 Posts: 389 Credit: 12,070,320 RAC: 8,465 |
Bryn Mawr - added the half_life, will sit back and see what happens. Or just let things be until they have a chance to settle down- with 8 active projects, even with the changed half life value, I'd expect you're looking at a couple of weeks. One week bare minimum. I recently (6 weeks ago) added a 5th project (6 if you include Ralph, which very rarely has work) because 3 of the projects were out of work / broken at the same time. One of my crunchers is now back to running smoothly, whilst the other still has the occasional lump or bump as one project or another grabs a bit extra, but is almost there. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Bryn Mawr - added the half_life, will sit back and see what happens. Or just let things be until they have a chance to settle down- with 8 active projects, even with the changed half life value, I'd expect you're looking at a couple of weeks. One week bare minimum. I had more than a lump and a bump before I tried dividing up the computer. Like now, WCG is really, really down, close to dead, and now that I have opened things back up it is still down, but the results I checked are pending. So there is hope. |
Bryn Mawr Joined: 26 Dec 18 Posts: 389 Credit: 12,070,320 RAC: 8,465 |
Bryn Mawr - added the half_life, will sit back and see what happens. Or just let things be until they have a chance to settle down- with 8 active projects, even with the changed half life value, I'd expect you're looking at a couple of weeks. One week bare minimum. That’s the project, not your machine. I’ve just had two days of low WCG credits and the shortfall turned up this morning - c’est la vie. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Bryn Mawr - added the half_life, will sit back and see what happens. Or just let things be until they have a chance to settle down- with 8 active projects, even with the changed half life value, I'd expect you're looking at a couple of weeks. One week bare minimum. I gave it 200% and now it's climbing like a jet plane. Just have to get LHC back up after WCG and then I think everything can go back to 100%. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1673 Credit: 17,593,339 RAC: 22,296 |
I gave it 200% and now it's climbing like a jet plane. Just have to get LHC back up after WCG and then I think everything can go back to 100%. And then it will drop again. So you'll change it, and it will rise again. So you'll change it and it will fall again. So you'll change it, and it will rise again. So you'll change it and it will fall again. Etc, etc. Most (if not all) of that rapid increase is not a result of your changes but is for the reason Bryn posted- the Project had a delay in granting Credit, and now it's all coming through. Hence the surge in Credit. RAC rises slowly, and falls quickly. The half_life change Bryn suggested should allow things to settle down sooner rather than later, but with the number of projects you have we're still talking weeks- not days. And as you change things, then change them back again, then change them, then change them again, it just keeps extending the time it will take for things to settle and actually meet whatever Resource share you finally leave things at for an extended period (i.e. over a few weeks). Grant Darwin NT |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
I gave it 200% and now it's climbing like a jet plane. Just have to get LHC back up after WCG and then I think everything can go back to 100%. And then it will drop again. Yeah, I know it drops. So I'm just ramming it through to get it up, and later, when I go back to work, I'll drop it. Half life was changed last week. |
Sid Celery Joined: 11 Feb 08 Posts: 2117 Credit: 41,139,251 RAC: 16,277 |
Project was down a little earlier, apparently to do a quick filesystem switch, but it got delayed and they didn't start it back up, so people would've seen "Server error: feeder not running" and "Project requested delay of 3600 seconds". Quickly fixed after a nudge. Looks fine now. You didn't imagine it. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1673 Credit: 17,593,339 RAC: 22,296 |
Quite a backlog of Validations now. Given that there is no longer any work for minirosetta, they could probably shut down all of the minirosetta processes, and make use of the freed up resources for a few more Rosetta Assimilators and Validators. From the Server Status page: rah_assimilator_rosetta1 (rosetta), rah_assimilator_rosetta2 (rosetta), rah_assimilator_rosetta3 (rosetta), rah_assimilator_rosetta4 (rosetta), rah_assimilator_rosetta5 (rosetta), rah_assimilator_mini1 (minirosetta), rah_assimilator_mini2 (minirosetta), rah_assimilator_mini3 (minirosetta), rah_assimilator_mini4 (minirosetta), rah_assimilator_mini5 (minirosetta), rah_validator_rosetta1 (rosetta), rah_validator_rosetta2 (rosetta), rah_validator_mini1 (minirosetta), rah_validator_mini2 (minirosetta). Grant Darwin NT |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1673 Credit: 17,593,339 RAC: 22,296 |
Validation backlog appears to be growing- now over 104,000. The Server Status for the Validators might be showing green, but they don't appear to be actually doing anything at present. Grant Darwin NT |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1673 Credit: 17,593,339 RAC: 22,296 |
Validation backlog appears to be growing- now over 104,000. Now over 114,000. Yep- it's broken. Grant Darwin NT |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 2,588 |
A task running MUCH longer than the expected 8 hours: aaab_nNMALA_pp-SAR_pp-mPPS-BGLY_pp_2_2245795_6_1, https://boinc.bakerlab.org/rosetta/result.php?resultid=1441862159 ; 2 days, 8 hours, 32 minutes so far; rosetta python 1.03 vbox64. This is elapsed time, not the much shorter CPU time. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1673 Credit: 17,593,339 RAC: 22,296 |
Validation backlog appears to be growing- now over 104,000. Now over 114,000. Now over 138k. Grant Darwin NT |
Bryn Mawr Joined: 26 Dec 18 Posts: 389 Credit: 12,070,320 RAC: 8,465 |
Validation backlog appears to be growing- now over 104,000. Now over 114,000. Now over 138k. And now over 176k, but some must be getting through. Yesterday I dropped to 3k credits for the day as everything was pending, but today I have 11k :-) |
Sid Celery Joined: 11 Feb 08 Posts: 2117 Credit: 41,139,251 RAC: 16,277 |
Validation backlog appears to be growing- now over 104,000. Now over 114,000. Now over 138k. Now up to 237k backlog, but I don't have any pending dated 28th Oct, so some are going through, just nowhere near enough to keep up, let alone catch up. I sent a message about 11hrs ago and got a reply about 8hrs ago that it'd be looked at when they got in, which I'm guessing would be ~6hrs ago. That it's not fully fixed yet indicates it's not as straightforward as the feeder issue a few days before. I've heard nothing more since. It's been reported and acknowledged. That's all I can say. PS: I'm away from home from yesterday until Sunday week, apart from 1.5 days, and on top of that my email provider has had a major outage which looks like it'll take 2-3 days to fix, making matters worse. I will be able to check in here for 6 of the 9 days I'm away, and I am using a backup email account if anything new comes up; hopefully I won't have to. When it rains it pours... Edit: When I started typing, my credits were 300 less than what was showing here, so I did a manual update and my credits were 400 more than are showing here. Lots from 29th October updated, but in quite a funny order. Maybe things are moving much more rapidly right now? Fingers crossed. |
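
On the half-life advice earlier in the thread (set rec_half_life to 1, then leave things alone): here is a minimal Python sketch of why a shorter half-life settles sooner. It assumes the client's scheduling history behaves like a plain exponentially decayed average. It is not BOINC's actual code, and the function names, the 5% cut-off and the 10-day comparison value are illustrative assumptions. It only models how fast old history fades; work already downloaded, deadlines and projects running out of work are not modelled, and they can stretch the real settling time to the weeks mentioned above.

```python
import math


def remaining_weight(days_elapsed: float, half_life_days: float) -> float:
    """Fraction of an old contribution still counted after days_elapsed days."""
    return 0.5 ** (days_elapsed / half_life_days)


def days_until_below(threshold: float, half_life_days: float) -> float:
    """Days until an old contribution's weight drops below threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    # Solve 0.5 ** (t / half_life) = threshold for t.
    return half_life_days * math.log2(1.0 / threshold)


if __name__ == "__main__":
    # Hypothetical comparison: a 10-day half-life versus the 1-day value
    # suggested in the thread, with an arbitrary 5% cut-off.
    for half_life in (10.0, 1.0):
        days = days_until_below(0.05, half_life)
        print(f"half-life {half_life:>4} days: old history under 5% after {days:.1f} days")
```

Under these assumptions the fade time scales linearly with the half-life, so a 1-day value forgets an old imbalance ten times sooner than a 10-day one. The same model suggests why repeated changes to Resource share keep things unsettled: each change starts a fresh transient that has to decay in turn.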
©2024 University of Washington
https://www.bakerlab.org