Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
A task has been running for over 12 hours so far, even though I've selected a run length of 8 hours. I let it go overnight; it finally finished after 15.5 hours. |
Joe Send message Joined: 24 Nov 17 Posts: 1 Credit: 3,788,239 RAC: 815 |
My FreeBSD machine with BOINC installed on it always fails to compute jobs: https://kitsunehosting.net/nextcloud/index.php/s/rysi6tY6TE33oZr/preview Now that I'm looking at it, I'm pretty sure it has never completed a job. Is there anything I should look into? Maybe some logs or something, to find out what's failing and whether I can fix it? Thanks so much for reading. |
Brian Nixon Send message Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
Your machine does have some credit, so it’s obviously succeeded in running something at some point – just not recently. The Exec format error failures I assume are because the system is unable to run Rosetta’s Linux application. (Note it was trying to run the 32-bit application, which may not be appropriate for a 64-bit system. It’s also possible that older application versions were able to run, but recent updates have broken something. I don’t know enough about BSD’s Linux capability to be able to diagnose further. Rosetta@home does not provide a native BSD application. One user did report success running the 64-bit Rosetta Linux application on FreeBSD recently.) The others failed to download some of their input files. Could something be blocking downloads? |
Brian Nixon Send message Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
Several of a new batch of horns5 tasks failing with access violations shortly after startup |
Jim1348 Send message Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"Several of a new batch of horns5 tasks failing with access violations shortly after startup" Maybe limited to Windows? I am running seven now (1 to 6 hours) on Ubuntu 18.04.5 (Ryzen 3900X) without a problem. PS - The sizes are quite reasonable, being less than 500 MB. That indicates they are not a new project, but a continuation of horns4. It is interesting to speculate what that might be... |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
"Several of a new batch of horns5 tasks failing with access violations shortly after startup" I looked at the stderr log for several of your failed tasks. About two thirds of them failed while trying to access location 0, and I can't read the dump well enough to tell what instruction was trying to access that location. I'll have to leave the problem to someone who can read dumps better than I can. I did notice that you are using Windows 7, rather than the newer Windows 10. The only recent horns5 task I spotted for my Windows 10 computer completed and validated. Also, I noticed that all of your computers run BOINC 7.16.5; my computer runs 7.16.11. If no one else helps, you could try updating BOINC on one of your computers showing the problem, and Windows on another, to see if either of these older versions causes the problem. |
Brian Nixon Send message Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
Also noticed the graphics app does not work (either disappears immediately or hangs) with those horns5 tasks that do manage to run |
Falconet Send message Joined: 9 Mar 09 Posts: 353 Credit: 1,222,776 RAC: 4,804 |
I've only got one horns5 but it's running fine under Ubuntu 18.04 at Google Colab. |
Bryn Mawr Send message Joined: 26 Dec 18 Posts: 389 Credit: 12,070,320 RAC: 12,300 |
No problem this end, two running at the moment and no failures. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
I just had one horns5 fail on my computer after about 20 minutes while another is past that point and about half finished. The error message for the one that failed looks likely to mean a problem in an input file. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
The strangest part is: some work units that have failed on my machines have succeeded elsewhere (example), and some that have failed elsewhere have succeeded on mine (example). This could mean that the application is picking up a random number from somewhere and using it as part of its input. If this is not deliberate, it could be the application program using the contents of some memory location without first setting it to a known value. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
"I just had one horns5 fail on my computer after about 20 minutes while another is past that point and about half finished." I now have three horns5 tasks running at once on my computer, all of them well past 20 minutes. |
Brian Nixon Send message Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
"I just had one horns5 fail" That’s the same error Grant reported this morning. Interesting that those ones detected a problem and exited, while the others have just fallen over in a heap. Of course if, as you suggest, they’re using uninitialised data somewhere, anything could happen… |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2115 Credit: 41,115,753 RAC: 19,563 |
"No problem this end, two running at the moment and no failures." Same |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1671 Credit: 17,527,680 RAC: 23,122 |
At least this time around with the horns5 Tasks i've had more Valid ones than errors. Last time, easily 90% were errors. Grant Darwin NT |
tom Send message Joined: 29 Nov 08 Posts: 10 Credit: 6,044,733 RAC: 0 |
for 5 months i have been producing exactly one error-free task a day. it's hard to do more than that when there's a limit, wouldn't you think? as for why i'm supposedly "producing errors", i'm running the same software that the project provided, on the same computer that has run it for years. and just coincidentally, these "errors" only started with the switchover to secure http. not interested anymore. |
Brian Nixon Send message Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
As we have tried to explain to you before: the limit is there to protect the project from hosts that fail to perform useful work, and if the problem were related to SSL you wouldn’t be able to download any tasks in the first place. It genuinely is coincidence that your trouble started around the same time as the switch. You’re not alone in finding that application version 4.20 doesn’t work on older versions of Mac OS, though the only resolution seems to be “try a different project”. |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1993 Credit: 9,520,400 RAC: 11,365 |
"You’re not alone in finding that application version 4.20 doesn’t work on older versions of Mac OS, though the only resolution seems to be “try a different project”." Or waiting for a bug-fixed version for Mac.... |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1671 Credit: 17,527,680 RAC: 23,122 |
Plenty of WUs ready to go (11 million queued jobs), but all i get is No Tasks sent when requesting new work to replace returned work (Ready to send is zero). In progress has fallen from 550k down to 400k. Someone needs to give the servers a kick. Grant Darwin NT |
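
On the Exec format error discussed earlier in the thread: one thing that could be checked is whether the application binary BOINC downloaded is a 32-bit or 64-bit ELF file. The following is only a minimal sketch, assuming Python 3 is available on the host. The function names `elf_class` and `describe_file` are made up for illustration and are not part of BOINC or Rosetta. It reads the standard ELF identification bytes (byte 4 is the class, byte 7 is the OS/ABI).

```python
# Sketch: report whether a file is a 32-bit or 64-bit ELF binary.
# elf_class() and describe_file() are hypothetical helper names.

ELF_MAGIC = b"\x7fELF"
EI_CLASS = {1: "32-bit", 2: "64-bit"}                  # e_ident[4]
EI_OSABI = {0: "System V", 3: "Linux", 9: "FreeBSD"}   # e_ident[7]


def elf_class(header: bytes) -> str:
    """Describe an ELF header given at least its first 8 bytes."""
    if len(header) < 8 or header[:4] != ELF_MAGIC:
        return "not an ELF file"
    bits = EI_CLASS.get(header[4], "unknown class")
    osabi = EI_OSABI.get(header[7], f"OS/ABI {header[7]}")
    return f"{bits} ELF, {osabi}"


def describe_file(path: str) -> str:
    """Read the identification bytes of the file at `path` and describe them."""
    with open(path, "rb") as f:
        return elf_class(f.read(16))
```

Calling `describe_file()` on the Rosetta executable inside the BOINC project directory (the exact path depends on the installation) would show which variant was downloaded. Note that many Linux binaries report the generic "System V" OS/ABI rather than "Linux", so that field alone does not settle whether a file is a Linux build.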
©2024 University of Washington
https://www.bakerlab.org