Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Sid Celery Joined: 11 Feb 08 Posts: 2115 Credit: 41,110,095 RAC: 19,722 |
Bump. Hand to mouth over the last 24hrs. Though tbf this is pretty typical for January. New tasks are a bit stop-start today. No tasks earlier today, then a batch of new ones came through, but empty again right now. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
Now that you mention it, I have one core free too. A manual update did not get anything. It appears that the Christmas holiday has come late (or early) this year. |
premier Joined: 30 Dec 05 Posts: 14 Credit: 23,872,868 RAC: 0 |
Same here. 11 machines are idle ATM. Not getting any new jobs. |
Sid Celery Joined: 11 Feb 08 Posts: 2115 Credit: 41,110,095 RAC: 19,722 |
Bump. Hand to mouth over the last 24hrs. A fair few tasks became available today, but it seems our buffers are so empty they all got taken and we're back to zero again. Some efforts seem to have been made, but more are still needed. |
Igor Kurpis Joined: 13 Jan 20 Posts: 2 Credit: 103,514 RAC: 0 |
Hello! I was looking at my results recently, and there is something odd in the output of a Rosetta task. I don't know if I should care or if there is a way to resolve this. <core_client_version>7.14.2</core_client_version> <![CDATA[ <stderr_txt> ter: 0x0000000005f08e93 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu': free(): invalid pointer: 0x0000000005f08e93 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu': free(): invalid pointer: 0x0000000005f08e93 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu': free(): invalid pointer: 0x0000000005f08e93 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu': free(): invalid pointer: 0x0000000005f08e93 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu': free(): invalid pointer: 0x0000000005f08e93 *** [...] ** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu': free(): invalid pointer: 0x0000000005f08e93 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu': free(): invalid pointer: 0x0000000005f08e93 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu': free(): invalid pointer: 0x0000000005f08e93 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu': free(): invalid pointer: 0x0000000005f08e93 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu': free(): invalid pointer: 0x0000000005f08e93 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu': free(): invalid pointer: 0x0000000005f08e93 *** Starting watchdog... Watchdog active. Starting watchdog... Watchdog active. 
====================================================== DONE :: 1 starting structures 28302 cpu seconds This process generated 41 decoys from 41 attempts ====================================================== BOINC :: WS_max 3.91049e+08 BOINC :: Watchdog shutting down... 03:06:19 (14253): called boinc_finish(0) </stderr_txt> ]]> 1118219840 The task itself completed successfully and has been validated, but those errors still don't look normal. In MiniRosetta there are no such errors. |
Sid Celery Joined: 11 Feb 08 Posts: 2115 Credit: 41,110,095 RAC: 19,722 |
Validators have been down for about 12 hours. Anyone around to give bwsrv2 a kick? 93k tasks awaiting validation - 30 of them mine. |
Sid Celery Joined: 11 Feb 08 Posts: 2115 Credit: 41,110,095 RAC: 19,722 |
Validators have been down for about 12 hours. No change after 24hrs - bwsrv2 still down. 185,069 tasks awaiting validation. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
Thanks for reporting. I normally would not notice. I trust it is not a big deal, but maybe maintenance on a server or something. However, it helps the crunchers to have a Plan B in mind. |
Sid Celery Joined: 11 Feb 08 Posts: 2115 Credit: 41,110,095 RAC: 19,722 |
Thanks for reporting. I normally would not notice. I trust it is not a big deal, but maybe maintenance on a server or something. No shortage of tasks throughout; the only problem was awarding credit. But all solved now and no tasks awaiting validation - all caught up, thanks. |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
Is your download server having problems? My computer has been trying to download a rather small input file for many hours, and fails every time. 10v1nmgb_c724_10mer_gb_000434.zip It looks like it won't download any more tasks until after it gets this input file. |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
2/7/2020 12:22:52 PM | | Project communication failed: attempting access to reference site 2/7/2020 12:22:52 PM | Rosetta@home | Temporarily failed download of 10v1nmgb_c724_10mer_gb_000434.zip: transient HTTP error 2/7/2020 12:22:52 PM | Rosetta@home | Backing off 03:13:23 on download of 10v1nmgb_c724_10mer_gb_000434.zip 2/7/2020 12:22:54 PM | | Internet access OK - project servers may be temporarily down. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
It looks like it won't download any more tasks until after it gets this input file. If it is holding up your machine, I think I would let the current tasks finish, detach, and try again. |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
It looks like it won't download any more tasks until after it gets this input file. How am I supposed to do that if the only current Rosetta@Home task won't finish downloading so that it can start? It's doing more for all the other BOINC projects I have selected that offer CPU tasks but no GPU tasks, though. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
How am I supposed to do that if the only current Rosetta@Home task won't finish downloading so that it can start? You detach and end its misery. Sometimes a reboot works though. |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
How am I supposed to do that if the only current Rosetta@Home task won't finish downloading so that it can start? A restart followed by telling BOINC to retry the download finally helped. The file downloaded, and the task is now ready to start. Previously, telling BOINC to retry the download without the Windows restart didn't help. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
I had the same problem with a stuck download, and a reboot fixed it for me too. But that practically never happens. So the fact that it is happening more often now indicates to me that their servers are overloaded. I will take a machine off. And if they want to tell us otherwise, I will listen. |
Mad_Max Joined: 31 Dec 09 Posts: 209 Credit: 25,792,616 RAC: 13,701 |
If retrying the download does not help, aborting the file transfer will usually work. The corresponding task will fail, but BOINC is smart enough to abort such tasks without trying to run them, so no computation is wasted. P.S. I have also had a few stuck files in the last few days (the previous such case was about a year ago). I think one of the files was exactly the same file. BOINC also stopped getting new work from R@H until I noticed it today and aborted the stuck file transfer. One of the tasks with "stuck" downloads: https://boinc.bakerlab.org/rosetta/result.php?resultid=1121514493 |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
I just had to abort one on my best machine, a Ryzen 3700x. A reboot did not fix it. Rosetta is beginning to lose some of its attraction for me. It was always a set-and-forget project. The errors were minor, and did not hang anything up. An explanation would be useful, as unlikely as that is. |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,716,372 RAC: 18,198 |
I'm getting lots of downloads (always a 3kB zip file) that stick (on all 4 computers). A temporary workaround seems to be to abort the task (not the download), then update the project so the project acknowledges you don't want that task that you can't get. It will then get others instead. But it's happening quite a lot. Unless I'm on holiday, I have a permanent monitor beside me showing what all my computers are doing on BOINC (using BoincTasks), but I'm sure many people won't check their machines that often. And if that download failed for me, will it fail for the next person it gives it to, and so on? Also, I seem to have quite a high percentage of "error while computing" on all 4 machines (about a third of them). Is this normal or should I be trying to tweak something? I know with LHC@home an update to the virtual machine fixed it. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
And if that download failed for me, will it fail for the next person it gives it to, and so on? I am wondering whether it is related to the high memory requirements of some of the files recently. https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13510 Probably they are two different things, but I will monitor the amount of available memory the next time I see one stuck. |
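
For anyone who prefers the command line, the workarounds described in this thread (retry or abort the stuck file transfer, then update the project) can probably be scripted with `boinccmd`. The following is only a sketch under some assumptions: `boinccmd` is on the PATH and can talk to the local client, and the project URL below matches the one your client is attached with (`boinccmd --get_project_status` should show it). The helper names `transfer_cmd`, `update_cmd`, and `unstick` are made up for this example; the file name is the one from the event log quoted above.

```python
import subprocess

# Assumption: this must match the project URL your own client reports.
PROJECT_URL = "https://boinc.bakerlab.org/rosetta/"


def transfer_cmd(filename, action, project_url=PROJECT_URL):
    """Build the boinccmd call that retries or aborts one file transfer."""
    if action not in ("retry", "abort"):
        raise ValueError("action must be 'retry' or 'abort'")
    return ["boinccmd", "--file_transfer", project_url, filename, action]


def update_cmd(project_url=PROJECT_URL):
    """Build the boinccmd call that asks the client to contact the project."""
    return ["boinccmd", "--project", project_url, "update"]


def unstick(filename, action="retry", project_url=PROJECT_URL, run=False):
    """Return the commands for one stuck download; execute them only if run=True."""
    cmds = [transfer_cmd(filename, action, project_url), update_cmd(project_url)]
    if run:
        for cmd in cmds:
            # check=True raises CalledProcessError if boinccmd reports a failure.
            subprocess.run(cmd, check=True)
    return cmds


if __name__ == "__main__":
    # Dry run: print the commands so they can be checked before running them.
    for cmd in unstick("10v1nmgb_c724_10mer_gb_000434.zip", "retry"):
        print(" ".join(cmd))
```

By default nothing is executed; the script only prints the two commands. Passing `run=True` runs them, and switching the action to `"abort"` mirrors the abort-then-update approach, which, as noted above, makes the corresponding task fail without any computation being wasted.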
©2024 University of Washington
https://www.bakerlab.org