Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Nicholas Hathaway Joined: 20 Nov 14 Posts: 6 Credit: 791,395 RAC: 0 |
Hi I am repeatedly getting the following in my event log: Sat Apr 18 07:39:31 2020 | Rosetta@home | Resetting project Sat Apr 18 23:48:22 2020 | Rosetta@home | Task hgfpsplit2_148_fold_SAVE_ALL_OUT_916496_89_0 exited with zero status but no 'finished' file Sat Apr 18 23:48:22 2020 | Rosetta@home | If this happens repeatedly you may need to reset the project. Sat Apr 18 23:53:46 2020 | Rosetta@home | Task hgfpsplit2_148_fold_SAVE_ALL_OUT_916496_89_0 exited with zero status but no 'finished' file Sat Apr 18 23:53:46 2020 | Rosetta@home | If this happens repeatedly you may need to reset the project. Sat Apr 18 23:54:40 2020 | Rosetta@home | Task hgfpsplit2_148_fold_SAVE_ALL_OUT_916496_89_0 exited with zero status but no 'finished' file Sat Apr 18 23:54:40 2020 | Rosetta@home | If this happens repeatedly you may need to reset the project. Sun Apr 19 00:46:30 2020 | Rosetta@home | Project requested delay of 7 seconds What do I need to do? |
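To see how often a given task is restarting, the log can be tallied per task. This is only a minimal Python sketch: it assumes the event log has been copied out as text in the `time | project | message` layout shown in the post, and the function name `count_restarts` is made up for illustration.

```python
from collections import Counter

# Lines copied from the event log in the post; fields are separated by " | ".
LOG = """\
Sat Apr 18 07:39:31 2020 | Rosetta@home | Resetting project
Sat Apr 18 23:48:22 2020 | Rosetta@home | Task hgfpsplit2_148_fold_SAVE_ALL_OUT_916496_89_0 exited with zero status but no 'finished' file
Sat Apr 18 23:53:46 2020 | Rosetta@home | Task hgfpsplit2_148_fold_SAVE_ALL_OUT_916496_89_0 exited with zero status but no 'finished' file
Sat Apr 18 23:54:40 2020 | Rosetta@home | Task hgfpsplit2_148_fold_SAVE_ALL_OUT_916496_89_0 exited with zero status but no 'finished' file
"""

MARKER = "exited with zero status but no 'finished' file"


def count_restarts(log_text):
    """Count, per task name, how often the 'no finished file' message appears."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3:
            continue  # skip anything that is not "time | project | message"
        message = parts[2]
        if message.startswith("Task ") and message.endswith(MARKER):
            task = message[len("Task "):-len(MARKER)].strip()
            counts[task] += 1
    return counts


restarts = count_restarts(LOG)
for task, n in restarts.items():
    print(f"{task}: {n} restarts without a 'finished' file")
```

A task that keeps showing up in this tally is the one to watch before deciding whether a project reset is worth trying.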
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
Hi I am repeatedly getting the following in my event log: How do your Computing preferences compare to these? Particularly the "When to suspend" settings. If they aren't a problem, then as it says in the Event log, you'll probably need to reset the project, but even then it may not fix the problem. It appears it's because the system is busy doing something else & BOINC can't communicate with the science application. Having "Use at most 100% of CPU time" set to less than 100% can cause it on some systems. As long as Tasks don't error out as a result, it's not a problem as such, but it does show some contention for resources on the system. You've also had a lot of Tasks miss the deadline, so a much smaller cache would be a good idea. Computing Usage limits Use at most 100% of the CPUs Use at most 100% of CPU time When to suspend Suspend when computer is on battery (not selected) Suspend when computer is in use (not selected) Suspend GPU computing when computer is in use (not selected) 'In use' means mouse/keyboard input in last 3 minutes Suspend when no mouse/keyboard input in last --- minutes Suspend when non-BOINC CPU usage is above --- % Compute only between --- Other Store at least 1 days of work Store up to an additional 0.02 days of work Switch between tasks every 60 minutes Request tasks to checkpoint at most every 60 seconds Disk Use no more than 20 GB Leave at least 2 GB free Use no more than 60 % of total Memory When computer is in use, use at most 95 % When computer is not in use, use at most 95 % Leave non-GPU tasks in memory while suspended (not selected) Page/swap file: use at most 75 % Grant Darwin NT |
Nicholas Hathaway Joined: 20 Nov 14 Posts: 6 Credit: 791,395 RAC: 0 |
Computing preferences These settings apply to all computers using this account except computers where you have set preferences locally using the BOINC Manager Android devices Computing Usage limits Use at most 100 % of the CPUs Use at most 100 % of CPU time When to suspend Suspend when computer is on battery Suspend when computer is in use Suspend GPU computing when computer is in use 'In use' means mouse/keyboard input in last 3 minutes Suspend when no mouse/keyboard input in last --- minutes Suspend when non-BOINC CPU usage is above --- % Compute only between --- Other Store at least 0.5 days of work Store up to an additional 1 days of work Switch between tasks every 60 minutes Request tasks to checkpoint at most every 60 seconds Disk Use no more than --- GB Leave at least --- GB free Use no more than 90 % of total Memory When computer is in use, use at most 90 % When computer is not in use, use at most 90 % Leave non-GPU tasks in memory while suspended Page/swap file: use at most 75 % Network Usage limits Limit download rate to --- KB/second Limit upload rate to --- KB/second Limit usage to --- MB every --- days When to suspend Transfer files only between --- Other Skip data verification for image files Confirm before connecting to Internet Disconnect when done |
Sid Celery Joined: 11 Feb 08 Posts: 2115 Credit: 41,112,600 RAC: 19,835 |
Not sure if I should report this as a problem, but... Still reporting tasks and getting more 24hrs after 0 to send and 0 in progress. Am I one of the 7 who reported in the last 24hrs? We are deprecating the 'rosetta_for_devices' app. The arm platforms have been added to the 'rosetta' application group. We will also be deprecating the minirosetta app and will soon have just the rosetta app. There are still some minirosetta jobs in our queue. Oh, maybe this explains it. Thought it was weird. |
MarkJ Joined: 28 Mar 20 Posts: 72 Credit: 25,238,680 RAC: 0 |
We are deprecating the 'rosetta_for_devices' app. The arm platforms have been added to the 'rosetta' application group. We will also be deprecating the minirosetta app and will soon have just the rosetta app. There are still some minirosetta jobs in our queue. Well that should reduce the development effort if you only have the one app (but multiple platforms). I still seem to be getting MiniRosetta and haven't cleared my cache (which is only 0.3 days) yet. BOINC blog |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
I figure there will be a few resends after the last of the initial Tasks have been sent out. But with the short deadlines & low replication, it should all be cleared up well within a week. We are deprecating the 'rosetta_for_devices' app. The arm platforms have been added to the 'rosetta' application group. We will also be deprecating the minirosetta app and will soon have just the rosetta app. There are still some minirosetta jobs in our queue. Well that should reduce the development effort if you only have the one app (but multiple platforms). I still seem to be getting MiniRosetta and haven't cleared my cache (which is only 0.3 days) yet. Grant Darwin NT |
zfp Joined: 22 Mar 20 Posts: 1 Credit: 114,637 RAC: 0 |
Hello, After a kernel update I restarted my system. As a result, all tasks that were running at the time of the reboot ran for an extremely long time, and all of them exited with: Exit status 139 (0x0000008B) Unknown error code
|
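For anyone puzzling over the number: on Linux a status above 128 conventionally means "killed by signal (status - 128)", so 139 points at signal 11. A minimal Python sketch of that decoding follows. It assumes BOINC is passing along the status in that shell-style convention, which the post does not confirm, and `describe_exit_status` is a hypothetical helper name.

```python
import signal


def describe_exit_status(status):
    """Interpret a Linux-style exit status the way a shell reports it.

    Assumption: values above 128 mean "killed by signal (status - 128)".
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    return f"exited normally with code {status}"


print(describe_exit_status(139))  # 139 is the same value as 0x8B
```

Signal 11 is a segmentation fault, which fits tasks that were interrupted by the reboot and could not resume cleanly from their saved state.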
ww Joined: 17 Mar 20 Posts: 3 Credit: 455,936 RAC: 0 |
Maybe a memory leak rb_04_16_21806_21365_ab_t000__robetta_cstwt_5.0_FT_IGNORE_THE_REST_05_08_918009_366 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1037241400 The first attempt (Windows 32-bit) failed at 12 hours of CPU time, RSS 354MB I have the second attempt on Linux 64-bit. If it actually needs this much memory, 32-bit wouldn't have been able to run it at all. RSS was at 3.09 GB. (Or some swapped out. Don't post tired kids.) RSS has been steadily climbing; it started at 1.8 GB. Now at 3.5 hours. Completion is on pace for 11.9 hour run-time. It appears to be check-pointing. Application Rosetta 4.15 Name rb_04_16_21392_21290__t000__4_C1_SAVE_ALL_OUT_IGNORE_THE_REST_917949_249 State Running Received Sat 18 Apr 2020 04:59:33 PM EDT Report deadline Tue 21 Apr 2020 04:59:33 PM EDT Estimated computation size 80,000 GFLOPs CPU time 03:38:02 CPU time since checkpoint 00:02:38 Elapsed time 03:38:38 Estimated time remaining 04:56:41 Fraction done 30.282% Virtual memory size 3.09 GB Working set size 2.89 GB Directory slots/5 Process ID 24116 Progress rate 8.280% per hour Executable rosetta_4.15_x86_64-pc-linux-gnu |
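The figures in this post are enough for a rough projection of where memory use would end up. The Python sketch below assumes the growth is linear from 1.8 GB at the start to 3.09 GB at 3.5 hours and stays that way until the 11.9 hour finish, which is a guess and not something the post establishes. The function name is made up for illustration.

```python
def projected_rss_gb(start_gb, current_gb, elapsed_h, total_h):
    """Linearly extrapolate memory use to the end of the run."""
    rate = (current_gb - start_gb) / elapsed_h  # GB per hour
    return rate, start_gb + rate * total_h


rate, final = projected_rss_gb(start_gb=1.8, current_gb=3.09,
                               elapsed_h=3.5, total_h=11.9)
print(f"growth about {rate:.2f} GB/h, projected about {final:.1f} GB at the end")
```

If the climb really is steady, the task would finish at roughly twice its current footprint, so it is worth watching whether the growth levels off after a checkpoint.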
Tom M Joined: 20 Jun 17 Posts: 87 Credit: 14,880,624 RAC: 117,108 |
Hello, I'm a newbie to Rosetta and got things set up and running ok. In the last two days I've noticed my laptop running this app in an odd manner. Instead of running at 100% CPU, it fluctuates between 33% and 100%, If you have Seti@Home mostly idling I would go to the S@H website and disable the "intel igpu" check box. Generally running any crunching task on that part of the Intel CPU chip slows the entire system down significantly. This is usually true for now; if/when Intel delivers on the planned upgrades to the iGPU it will start behaving more like AMD's iGPU, but not yet. Tom M Help, my tagline is missing..... Help, my tagline is......... Help, m........ Hel..... |
Maslo55 Joined: 3 Mar 08 Posts: 1 Credit: 1,029,280 RAC: 0 |
I have some random crashes, once every few days, I find my crunching computer rebooted when I return to it. I also run Folding@home which I thought was responsible, but in Windows Event viewer the faulting application seems to be rosetta: Faulting application name: rosetta_4.15_windows_x86_64.exe, version: 0.0.0.0, time stamp: 0x5e856ed2 Faulting module name: unknown, version: 0.0.0.0, time stamp: 0x00000000 Exception code: 0xc0000005 Fault offset: 0x0000000000000000 Faulting process id: 0x2f68 Faulting application start time: 0x01d61c558f1205cc Faulting application path: C:\ProgramData\BOINC\projects\boinc.bakerlab.org_rosetta\rosetta_4.15_windows_x86_64.exe Faulting module path: unknown Report Id: e3f4316a-3112-476b-9f13-f2fcc13a42e3 Faulting package full name: Faulting package-relative application ID: I have Ryzen 3600 with slightly overclocked RAM, would probably try default, or increasing voltage. All testing programs show no errors. I get some computation errors, but very infrequently. Rosetta seems to be a better RAM tester than Memtest for me. |
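The exception code in that event entry is a standard Windows NTSTATUS value, and 0xC0000005 is STATUS_ACCESS_VIOLATION. A small Python sketch of how the 32-bit value breaks down is given here. The lookup table is deliberately tiny and the helper name `decode_ntstatus` is hypothetical, so treat it as an illustration and not a complete decoder.

```python
# Small lookup of well-known NTSTATUS values; deliberately incomplete.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def decode_ntstatus(code):
    """Split a 32-bit NTSTATUS value into its bit fields."""
    return {
        "severity": (code >> 30) & 0x3,    # top two bits; 3 means "error"
        "facility": (code >> 16) & 0xFFF,  # 12-bit facility field
        "code": code & 0xFFFF,             # low 16 bits
        "name": KNOWN_NTSTATUS.get(code, "not in this table"),
    }


info = decode_ntstatus(0xC0000005)
print(info)
```

An access violation with an unknown faulting module and a fault offset of zero is consistent with the process jumping to or reading from a bad address, which is one way unstable RAM settings can show up.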
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
I have Ryzen 3600 with slightly overclocked RAM, would probably try default, or increasing voltage. All testing programs show no errors. I get some computation errors, but very infrequently. Or better yet default clocks & voltage to see if that sorts out the problem. Grant Darwin NT |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
I have some random crashes, once every few days, I find my crunching computer rebooted when I return to it. I also run Folding@home which I thought was responsible, but in Windows Event viewer the faulting application seems to be rosetta: [snip] I'm running Rosetta, some other BOINC projects, and Folding@home on my computer at the same time. Only BOINC currently gets the GPU, since I'm trying to avoid changes in how loud the computer's fan is, and Folding@home doesn't always have GPU WUs available. This often causes crashes of my browser, but not also of Windows. It tends to make Rosetta tasks take about twice as much clock time to finish, though. I'm still trying to find out how many virtual CPU cores Folding@home uses at the same time, and how to control this - it appears that the slowdown is due to more background tasks trying to grab CPU time than there are virtual CPU cores to provide such CPU time. There needs to be a discussion somewhere of how to make BOINC and Folding@home share a computer; I haven't found one at Folding@home. Changing the Folding@home power setting to light helped reduce the crashes, but has not done much to the slowdown problem. |
Millenium Joined: 20 Sep 05 Posts: 68 Credit: 184,283 RAC: 0 |
Are you running together CPU tasks on BOINC and Folding? If yes then it's nonsense. You just slow them all as they use more memory (and memory bandwidth, and CPU context changes and whatever) without need. Either use the CPU for Folding or for BOINC. There is no way to fix that, you can't just add threads that require CPU usage and expect them all to not be inefficient. GPU is a different thing of course. |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
Are you running together CPU tasks on BOINC and Folding? If yes then it's nonsense. You just slow them all as they use more memory (and memory bandwidth, and CPU context changes and whatever) without need. Either use the CPU for Folding or for BOINC. There is no way to fix that, you can't just add threads that require CPU usage and expect them all to not be inefficient. The Folding@home method for finishing the current WU and then stopping doesn't work, and I don't want to let a Folding@home workunit time out instead of finishing. I've already found a way to limit the number of threads BOINC is using. If I can find a similar method for Folding@home, I should be able to stop their contention for virtual cores, but let both continue to run CPU work. |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
Maslo55, Error Code 0xc0000005 under Windows 7 or 10 is an access violation: the program tried to use memory it was not allowed to touch, and it often shows up when a program fails to start. You should check if the total amount of memory in use is approaching the total amount of memory that your computer has. If so, the problem is not specific to Rosetta@home, but a problem with trying to run too many memory-demanding programs at once. You can either add more memory to your computer, or reduce the amount of work your computer is trying to do at once. |
strongboes Joined: 3 Mar 20 Posts: 27 Credit: 5,394,270 RAC: 0 |
In Folding@home you can set the number of cores in the CPU slot: change it from -1 to the value you want. I have found there is no optimal setting for running both simultaneously. |
Brummit Joined: 14 Jul 14 Posts: 2 Credit: 30,582 RAC: 0 |
Is there any way you could set up an option to download smaller work units? My average stats for Rosetta are - 159 completed, and 78 failed. Optimistically that's 1/3 of downloaded work deemed invalid due to running out of time, requiring someone else to (re)process the data, and pessimistically, just under half the data fails the deadline. I run the PC 12-15 hours per day on average. A waste of processing time for all. My PC, though not the latest super duper 1000 core gamer extravaganza, was custom built two years ago, and is still pretty good. Thank you 'Brummit'. |
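The "1/3" and "just under half" figures are two different ways of dividing the same counts. A quick Python check using only the numbers from the post; which of the two ratios is the more meaningful one is left open.

```python
completed, failed = 159, 78

share_of_all = failed / (completed + failed)  # failed out of everything downloaded
ratio_to_completed = failed / completed       # failures relative to completions

print(f"{share_of_all:.1%} of all tasks failed")
print(f"{ratio_to_completed:.1%} as many failures as completions")
```

The first ratio comes out near one third and the second just under one half, matching the optimistic and pessimistic readings in the post.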
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
Brummit, Under the advanced interface, Your account, Rosetta@home preferences, you can try reducing the Target CPU run time by about a third of its current value. But note that there's a minimum value you're not allowed to go below. This should give you workunits that run for shorter times, but need about the same amount of memory. Does this fit your idea of smaller work units? |
Sid Celery Joined: 11 Feb 08 Posts: 2115 Credit: 41,112,600 RAC: 19,835 |
Is there any way you could set up an option to download smaller work units? Have you only recently returned to this project? It looks like you've had a few days off after receiving tasks and had to abort some. If you're online 12-15hrs/day you should be able to complete 8hr tasks ok when they have a 3-day deadline. Try to let them run and complete and it should improve the more tasks you complete and return. It should settle down after a few days. Give it another try. |
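The reasoning here (8-hour tasks, a 3-day deadline, 12 to 15 hours of uptime a day) can be turned into a back-of-the-envelope capacity check. The Python sketch below uses a deliberately simplified model that is an assumption and not how BOINC's scheduler actually works: each core crunches Rosetta for the whole time the PC is on, and all tasks are treated as downloaded at the same moment. The function name `tasks_that_fit` is made up for illustration.

```python
import math


def tasks_that_fit(deadline_days, on_hours_per_day, task_hours):
    """How many back-to-back tasks one core can finish before the deadline.

    Simplified model (an assumption, not BOINC's scheduler): the core works
    on Rosetta for the whole time the PC is on, and all tasks were
    downloaded at the same moment.
    """
    available_hours = deadline_days * on_hours_per_day
    return math.floor(available_hours / task_hours)


for hours in (12, 15):
    n = tasks_that_fit(deadline_days=3, on_hours_per_day=hours, task_hours=8)
    print(f"{hours} h/day -> {n} tasks per core before the deadline")
```

Under these assumptions a core gets through only a handful of 8-hour tasks per deadline window, so a queue holding more than that per core is likely to produce missed deadlines.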
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
Is there any way you could set up an option to download smaller work units? Just set a smaller cache. Reducing the Target CPU Run time (at this point) carries a high risk that you'll just end up with even more Tasks downloading, missing the deadlines & then erroring out than is already happening. On your account page, Preferences, When and how BOINC uses your computer, Computing preferences Other Store at least 0.6 days of work Store up to an additional 0.02 days of work Works for me. It takes an extremely long time for the Estimated completion times to get reasonably close to the actual time (Target CPU Run time). And even so, it will take a while for BOINC to determine how many hours a day your computer is on, and how much time it is able to process work while it is on (the default settings can mean just browsing sites with heavy graphics/scripts will stop BOINC from processing work). Grant Darwin NT |
©2024 University of Washington
https://www.bakerlab.org