Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Kissagogo27 Joined: 31 Mar 20 Posts: 86 Credit: 2,883,897 RAC: 2,627 |
<core_client_version>7.16.11</core_client_version> <![CDATA[ <stderr_txt> command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe -run:protocol jd2_scripting -parser:protocol dock_and_relax.xml @flags_21ffc515 -in:file:silent drhicks1_fd_21ffc515_egg_140_3229_348_1_000000036_0001_PJS-I-23D_xtl_ROSETTA_relax_super2_SAVE_ALL_OUT_IGNORE_THE_REST_1aa1aa1a.silent -in:file:silent_struct_type binary -silent_gz -mute all -silent_read_through_errors true -out:file:silent_struct_type score_jump -out:file:silent default.out -in:file:boinc_wu_zip drhicks1_fd_21ffc515_egg_140_3229_348_1_000000036_0001_PJS-I-23D_xtl_ROSETTA_relax_super2_SAVE_ALL_OUT_IGNORE_THE_REST_1aa1aa1a.zip -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 3919319 Using database: database_357d5d93529_n_methylminirosetta_database ERROR: Assertion `active( key )` failed. ERROR:: Exit from: C:cygwin64homeboinc4.17Rosettamainsourcesrcutility/keys/SmallKeyVector.hh line: 548 02:56:26 (948): called boinc_finish(0) </stderr_txt> ]]> from this WU https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1198052234 |
Michael E.@ team Carl Sagan Joined: 5 Apr 08 Posts: 16 Credit: 1,927,975 RAC: 506 |
I downloaded this work unit: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1202147167 It never begins processing. It stays in the "Ready to Start" state. Tasks from other projects process just fine. I have used BOINC for two+ decades but never saw this happen before. The work unit is Rosetta version 4.20, BOINC is at Version 7.16.11, and it is a Windows 10 system with a GPU. The Options > Computing Preferences are set at 50% of CPUs (6). There are no work units in the Transfers tab. Should I abort it and get some new Rosetta work units? Or abort and reset the Rosetta project? Anyone ever seen this? |
Bryn Mawr Joined: 26 Dec 18 Posts: 389 Credit: 12,070,320 RAC: 12,300 |
I downloaded this work unit: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1202147167 Without seeing what, for example, Milky Way is doing on that machine at the same time it’s impossible to say. You need to look at the full picture, not just one project. As an example, if one of the other projects has had an off day and fallen behind on its resource share then BOINC will suspend processing on Rosetta, leaving all WUs as Ready to Start, until the other project has caught up. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1671 Credit: 17,527,680 RAC: 23,122 |
I downloaded this work unit: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1202147167 I would suggest setting your cache to 0 as you are signed up to a dozen projects, almost half of them active. The smaller the cache, the sooner the system can meet your resource share settings- with that many projects i'd suggest you'd be looking at weeks. With even a small cache, it will take months. Preferences, When and how BOINC uses your computer; Computing preferences, Computing, Other: Store at least 0.00 days of work; Store up to an additional 0.01 days of work. I would also run the benchmarks on that system- it is showing the default values, and as they are used when it comes to allocating work (as well as allocating Credit for work done) it is probably impacting on what work is done & when. On the BOINC manager, Tools, Run CPU benchmarks. Grant Darwin NT |
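The cache values above can also be kept in a local override file instead of being set through the Manager. What follows is a minimal sketch, assuming the client reads `global_prefs_override.xml` from the BOINC data directory and that `work_buf_min_days` and `work_buf_additional_days` are the elements behind "Store at least" and "Store up to an additional". The helper name `build_cache_override` is made up for this example, and the script only prints the XML; it does not touch any BOINC files.

```python
import xml.etree.ElementTree as ET


def build_cache_override(min_days: float = 0.0, additional_days: float = 0.01) -> str:
    """Return override XML for the work cache (hypothetical helper, not a BOINC API)."""
    root = ET.Element("global_preferences")
    # Assumed mapping: "Store at least N days of work"
    ET.SubElement(root, "work_buf_min_days").text = f"{min_days:.2f}"
    # Assumed mapping: "Store up to an additional N days of work"
    ET.SubElement(root, "work_buf_additional_days").text = f"{additional_days:.2f}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML so it can be reviewed before being saved by hand.
    print(build_cache_override())
```

If the output is saved into the data directory, the client would still need to re-read its preferences (or be restarted) before the new cache size applies.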
Michael E.@ team Carl Sagan Joined: 5 Apr 08 Posts: 16 Credit: 1,927,975 RAC: 506 |
Without seeing what, for example, Milky Way is doing on that machine at the same time it’s impossible to say. You need to look at the full picture, not just one project. Sorry for the incomplete info! No other CPU tasks are running from other projects. The other 3 projects on this PC do not allow new tasks (No New Tasks selected). Thanks for the questions! I would suggest setting your cache to 0 as you are signed up to a dozen projects, almost half of them active. No other active CPU tasks. Four total projects on this PC. CPU benchmarks result: 3/1/2021 11:04:04 AM | | Running CPU benchmarks 3/1/2021 11:04:05 AM | | Suspending computation - CPU benchmarks in progress 3/1/2021 11:04:36 AM | | Benchmark results: 3/1/2021 11:04:36 AM | | Number of CPUs: 3 3/1/2021 11:04:36 AM | | 4742 floating point MIPS (Whetstone) per CPU 3/1/2021 11:04:36 AM | | 13780 integer MIPS (Dhrystone) per CPU 3/1/2021 11:04:37 AM | | Resuming computation 3/1/2021 11:12:48 AM | | General prefs: from http://einstein.phys.uwm.edu/ (last modified ---) 3/1/2021 11:12:48 AM | | Computer location: home 3/1/2021 11:12:48 AM | | General prefs: using separate prefs for home 3/1/2021 11:12:48 AM | | Reading preferences override file 3/1/2021 11:12:48 AM | | Preferences: 3/1/2021 11:12:48 AM | | max memory usage when active: 2428.71 MB 3/1/2021 11:12:48 AM | | max memory usage when idle: 8095.70 MB 3/1/2021 11:12:48 AM | | max disk usage: 8.00 GB 3/1/2021 11:12:48 AM | | max CPUs used: 3 3/1/2021 11:12:48 AM | | suspend work if non-BOINC CPU load exceeds 35% 3/1/2021 11:12:48 AM | | (to change preferences, visit a project web site or select Preferences in the Manager) . . . Good suggestion to run the benchmarks. Yes, I use the Advanced View and I use Local Pref's. I removed Einstein a few days ago so not sure why it appeared in the benchmarks. I changed the cache for now but do not see why that matters. Cache was previously set for 1 day. I exited and restarted BOINC. 
I just enabled Rosetta to download new tasks and it downloaded 2 tasks. I will let them finish - all 3 are running. I had an issue with GPUGrid a few weeks ago and had to remove BOINC (and its ProgramData directory) completely. Not sure that is related. Anyhow, not sure why it is fixed (maybe reducing cache?) but it is working OK now. Thanks! |
mrhastyrib Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
Hello to all. I recently joined Rosetta@home with three computers. Things were fine until a few days ago. The wireless connection was interrupted over the weekend for one of the three, causing some of the tasks to time out for processing start (I am not sure if any of this history is relevant to the problem). I reconnected the wireless, and cleared out the task queue, and it filled up with new tasks. Since then, none of the new tasks will start. They all just sit at "Ready to start." Eventually, the new tasks abort for not starting by the deadline. I have been fiddling with the settings, and ran a CPU benchmark, nothing helps. I even deleted the program and reinstalled it. The other two computers continue to operate normally. All three computers are operating on Linux Mint. I tried to search for information about this problem; there is little that I could find other than it seems to be something that others encounter because of conflicts with other projects. I am on Rosetta@home only. Any guidance towards a solution would be appreciated. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1671 Credit: 17,527,680 RAC: 23,122 |
Any guidance towards a solution would be appreciated. Are you using Web based preferences, or settings in the BOINC Manager? If the Web based settings, in the Manager menu, select Options, Computing preferences and make sure it shows that the Web based preferences are being used. If local, make sure a value hasn't been set that stops BOINC from running. With your computing preferences, what "Usage limits" & "When to suspend" values do you have? Ideally- Usage limits: Use at most 100 % of the CPUs; Use at most 100 % of CPU time. When to suspend: Suspend when computer is on battery; Suspend when computer is in use; Suspend GPU computing when computer is in use; 'In use' means mouse/keyboard input in last 3 minutes; Suspend when no mouse/keyboard input in last --- minutes; Suspend when non-BOINC CPU usage is above --- %; Compute only between ---. If it's set to suspend at any time, check to see that there is nothing going on, on that system, that meets any of those settings values- eg some system or other process using CPU time, stopping the Tasks from starting. Check that something isn't hogging system RAM, and hitting the limits that stop BOINC from processing work. In the BOINC Manager, you can select one of the Tasks ready to start, Suspend it, then Resume it a few seconds later & see if that kick starts things. And even with just Rosetta as your only project, with the very short deadlines, no cache (or an extremely small one, eg 0.1 + 0.01) is the best way to go. Grant Darwin NT |
mrhastyrib Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
Thank you for your detailed reply. Are you using Web based preferences, or settings in the BOINC Manager? The settings in the manager. I assume that is "local preferences." With your computing preferences, what "Usage limits" & "When to suspend" values do you have? Usage limits: Use at most 100 % of the CPUs <-- this; Use at most 100 % of CPU time <-- this. When to suspend: Suspend when computer is on battery <-- this is checked; Suspend when computer is in use <-- not checked; Suspend GPU computing when computer is in use <-- not checked; 'In use' means mouse/keyboard input in last 3 minutes <-- this would be the value if checked; Suspend when no mouse/keyboard input in last --- minutes <-- this line does not appear in my options; Suspend when non-BOINC CPU usage is above --- % <-- would be 25% but not checked; Compute only between --- <-- this line does not appear in my options. If it's set to suspend at any time, check to see that there is nothing going on, on that system, that meets any of those settings values- eg some system or other process using CPU time, stopping the Tasks from starting. I unplugged/replugged the power cable and verified that it suspends when on battery. This laptop is the one that I use exclusively for Zoom calls. I have nothing running on it when not using Zoom, and I halt BOINC when making a call (about once per week). I've also gone into the system monitor and killed things like Nemo, Mint Update, bluetooth, etc to maximize available memory. Check that something isn't hogging system RAM, and hitting the limits that stop BOINC from processing work. System monitor currently shows 1.2 GB of memory used out of 3.3 GB available. The "top" utility in the terminal shows 3336 MB total; 195 MB free; 805 MB used; 2334 MB cache/buffer. In the BOINC Manager, you can select one of the Tasks ready to start, Suspend it, then Resume it a few seconds later & see if that kick starts things. 
I've tried this; also tried updating; suspending all tasks and updating and restarting; everything that I can think of. And even with just Rosetta as your only project, with the very short deadlines no cache (or an extremely small one) eg 0.1 + 0.01 is the best way to go. I read this in one of your responses to someone else and have tried this, but it does not affect the problem. I also went into web preferences and changed everything to match this laptop, but it doesn't work. One other thing: all three of my computers are running different revisions of BOINC (is that normal?). The version for the affected computer is 7.16.6. I found a reference on the Internet (and the Internet is never wrong [snurk]) that said that this revision is unstable. I don't know if that is currently true. Is there a way to use a different revision and see if that helps? Thank you for your efforts in trying to help. Any idea where I can go from here? What perplexes me is that it once worked, but now does not, even though I didn't change anything in the settings, or even use the computer at all prior to the onset. Yet deleting and then reinstalling did not cure it. Restarting the computer does not help. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1671 Credit: 17,527,680 RAC: 23,122 |
Thank you for your efforts in trying to help. Any idea where I can go from here? What perplexes me is that it once worked, but now does not, even though I didn't change anything in the settings, or even use the computer at all prior to the onset. Yet deleting and then reinstalling did not cure it. Restarting the computer does not help. From the sounds of things, there's no obvious reason for them not to be running. Grasping at straws here- set it to use the Web based preferences. Once that is done, you need to click Update on the Manager to make sure it has the current Web based settings. If it then works, then go back to the Manager based settings & see if it keeps working. If it's still not starting- in the Manager, Options, Event log options, select cpu_sched_debug, cpu_sched, and cpu_sched_status and save them. Exit BOINC (give it a few seconds as it can take a while to fully exit), then restart and once it's running again, go to Tools, Event log & see what's there. With luck, one of those flags will either show why (or give an indication of why) it's not starting any of the Tasks. If there's nothing that gives us any sort of hint as to what's going on, then your best option would be to post about the issue in the BOINC forum. There's a good chance someone there will have an idea of what is going on with that system. Grant Darwin NT |
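For hosts without the Manager GUI, the same three Event log flags can be written to the client configuration file. This is a sketch only, assuming the flags are accepted as `<log_flags>` entries in `cc_config.xml` in the BOINC data directory. The helper `build_cc_config` is a made-up name, and the script prints the XML rather than writing it, since an existing `cc_config.xml` may hold other options that should be merged rather than overwritten.

```python
import xml.etree.ElementTree as ET

# The three scheduler flags named in the post above.
SCHED_FLAGS = ("cpu_sched", "cpu_sched_debug", "cpu_sched_status")


def build_cc_config(flags=SCHED_FLAGS) -> str:
    """Return cc_config XML with each log flag enabled (hypothetical helper)."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for flag in flags:
        # "1" switches a log flag on.
        ET.SubElement(log_flags, flag).text = "1"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cc_config())
```

These flags are verbose, so it is probably worth switching them off again once the cause has been found.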
mrhastyrib Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
Grasping at straws here- set it to use the Web based preferences. Once that is done, you need to click Update on the Manager to make sure it has the current Web based settings. If it then works, then go back to the Manager based settings & see if it keeps working. It's running again. For the sake of some other poor soul who runs into a similar problem, I will document the sequence of events leading to the breakthrough. * Changed to web-based preferences, nothing changed. Saved the settings as my local preferences; again no change. * Added the three flags that you suggested; exited BOINC, restarted computer. Upon restart of BOINC, the task at the top of the list had a new status: "Waiting for memory." Fiddling around with different things and updating did not start the tasks or change the status of the lead task. * Went into prefs, selected the tab Disk and Memory, and under Memory changed the parameter "When computer is in use, use at most" from 50% to 90%. And at this, the tasks took off and started running again. * Not being content with success, and wanting answers, I changed the parameter back to 50%, and it is still running. Based on my experience plus that of one other user that I read about, the underlying issue is that BOINC seems to think that it does not have enough available memory to begin running tasks, so that is where corrective efforts should be directed. If any new information comes my way, I will pass it along. Meanwhile, I'm back in the saddle again, yee-haw! Many thanks for your assistance with this problem. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1671 Credit: 17,527,680 RAC: 23,122 |
We both forgot the first rule of computer problem solving- re-boot. Upon restart of BOINC, the task at the top of the list had a new status: "Waiting for memory." Probably worth posting this issue to the BOINC boards. What usually happens is that if a Task uses more RAM than is available, or it requires more RAM than is available in order to start, then its status becomes "Waiting for memory." It shouldn't be "Ready to start" (even though it is), if it can't actually start for some reason. The issue will eventually re-occur- when it does, check the Event log to see if there are any messages there about insufficient RAM, even if it still shows the Tasks as "Ready to start." * Went into prefs, selected the tab Disk and Memory, and under Memory changed the parameter "When computer is in use, use at most" from 50% to 90%. And at this, the tasks took off and started running again. I've got my system set to 95% for both memory limit settings. With Rosetta you generally need to allow 1.3GB RAM per Task. Many are much less than that, some are 4GB or more. So with the amount of RAM that system has, and the number of cores/threads the CPU has, and the 50% limit on RAM usage, i would expect you will run into the issue again in the future. NB I have nothing running on it when not using Zoom, and I halt BOINC when making a call (about once per week). In the BOINC Manager, Options, Exclusive applications. If you put the zoom executable name there, BOINC will automatically suspend processing when Zoom is running, and restart when you're done. Grant Darwin NT |
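The memory arithmetic in the post above can be checked with a few lines. This is a rough sketch, not how the BOINC client actually schedules work (it tracks each Task's real working set); it only divides the RAM budget by the 1.3 GB per-Task rule of thumb quoted above. The 3.3 GB figure is the total reported for the laptop earlier in the thread, and the function name is made up for this example.

```python
def tasks_that_fit(total_ram_gb: float, ram_limit_pct: float, per_task_gb: float = 1.3) -> int:
    """Estimate how many Tasks fit inside BOINC's RAM budget (rule-of-thumb only)."""
    # RAM that BOINC may use under the "use at most N %" memory preference.
    budget_gb = total_ram_gb * ram_limit_pct / 100
    # Whole Tasks only: a Task that does not fully fit cannot start.
    return int(budget_gb // per_task_gb)


if __name__ == "__main__":
    for pct in (50, 90, 95):
        print(f"{pct}% of 3.3 GB -> {tasks_that_fit(3.3, pct)} typical Task(s)")
    # One of the occasional 4 GB Tasks, even at the 95% limit:
    print("4 GB Task at 95%:", tasks_that_fit(3.3, 95, per_task_gb=4.0))
```

Under these assumptions the 50% limit leaves room for a single typical Task and none of the larger ones, which is consistent with the "Waiting for memory" status described above.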
Bill F Joined: 29 Jan 08 Posts: 44 Credit: 1,560,806 RAC: 1,735 |
People deserve a proper break and I'm not inclined to demand they're dragged in to suit, what are essentially, hobbyists. No need to go in, just do a remote login & restart. If it fixes it, good. If not, then it can wait till they do go in. For a Small project it has a good base, not counting Users that do not allow data export, 32,000+ Users on 88,000+ Systems contribute daily. Bill F In October 1969 I took an oath to support and defend the Constitution of the United States against all enemies, foreign and domestic; There was no expiration date. |
Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
I have nothing running on it when not using Zoom, and I halt BOINC when making a call (about once per week). In the BOINC Manager, Options, Exclusive applications. If you put the zoom executable name there, BOINC will automatically suspend processing when Zoom is running, and restart when you're done. I'd suggest to not suspend it at all and see if some issues occur. So far (and I've been crunching since 2003 on usually pretty old hardware), except for GPU applications, I have never seen any reason to suspend BOINC; if a non-BOINC application needs 100% of the CPU, it will get it. |
mrhastyrib Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
I'd suggest to not suspend it at all and see if some issues occur....if a non-BOINC application needs 100% of the CPU, it will get it. I might try the exclusion thing that Grant suggested. The reason that I turned off BOINC was that on this laptop, which is optimized for weight and not performance, the latency of the handoff from BOINC to Zoom was a problem, not that it made the handoff at all. |
wolfman1360 Joined: 18 Feb 17 Posts: 72 Credit: 18,450,036 RAC: 0 |
I'd suggest to not suspend it at all and see if some issues occur....if a non-BOINC application needs 100% of the CPU, it will get it. This is what I use, especially for meetings involving video and / or screen share. Just make sure you have enough memory to keep tasks suspended indefinitely without thrashing the swap. Not that Zoom uses a lot. |
Michael E.@ team Carl Sagan Joined: 5 Apr 08 Posts: 16 Credit: 1,927,975 RAC: 506 |
The problem I reported earlier to Grant with tasks not running has been solved. It may apply to the memory issues reported recently. That is, tasks you expected to run were not running. I did not see anything in the event log that provided a clue (but I may need to enable certain messages). My PC has a small SSD disk so I was careful about how much disk space gets used. The same applies to memory use, although I check the Windows Task manager and see how much memory processes are using. On my son's 8 GB RAM PC, I cannot run two Rosetta tasks at the same time. If you see a task that should be running but is not, in Preferences (Options > Computing Preferences in the Advanced View), tap/click the Disk and Memory tab and check the settings there. Mike |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,716,372 RAC: 18,198 |
People deserve a proper break and I'm not inclined to demand they're dragged in to suit, what are essentially, hobbyists. No need to go in, just do a remote login & restart. If it fixes it, good. If not, then it can wait till they do go in. It's not that they don't allow it, it's that they didn't notice the communist EU ruling and have to tick a box. The EU thinking they can control the entire world. |
mrhastyrib Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
Me again. I added a new host today. For all of the other hosts in my account, when they download new work and it is queued and waiting to run, the "Time Remaining" value is 8:00:00 hours. However, I see that the queued work for my new host is 9:00:04 hours [sic]. I have dug through the message boards and see references to making changes to this, like it is user-defined, but nothing about how to actually change it. My questions: why the difference? Should I change it to 8:00:00 to be consistent with the other hosts, and if so, how to do that? |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1671 Credit: 17,527,680 RAC: 23,122 |
My questions: why the difference? Should I change it to 8:00:00 to be consistent with the other hosts, and if so, how to do that? Don't worry about it. It should be 8 hours, and after it's processed several Tasks it will start heading towards 8 hours & will get there eventually as work is returned. Edit- i'd run the BOINC Manager benchmarks on the i7- it's showing the default values which are way less than what that system is capable of. Grant Darwin NT |
Brian Nixon Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
The initial run time estimate is always off for new hosts. As Grant said: don’t worry about it; give it a couple of days for client and server to agree on how long tasks will run for. Despite the inaccurate estimate, you should find that they will finish after 8 hours (of CPU time, not wall time) regardless. (Indeed they did.) You can change the target CPU run time in your project preferences. Reasons to reduce it include having a machine with severely limited availability (though if you can’t dedicate 8 hours of CPU time inside 72 hours of wall time, Rosetta@home might not be best suited for you anyway); reasons to increase it might be to reduce the total amount of network traffic between client and server. The credits per hour will be (more or less) the same whatever you choose. |
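The "8 hours of CPU time inside 72 hours of wall time" rule of thumb above works out as follows. This is a back-of-the-envelope sketch using only those two figures; the function names are made up, and "availability" here means the fraction of wall time a core actually spends on BOINC work.

```python
def min_availability(cpu_hours: float = 8.0, deadline_hours: float = 72.0) -> float:
    """Smallest fraction of wall time a core must crunch to finish one Task in time."""
    return cpu_hours / deadline_hours


def tasks_per_core_before_deadline(availability: float,
                                   cpu_hours: float = 8.0,
                                   deadline_hours: float = 72.0) -> int:
    """How many Tasks one core can finish, back to back, within one deadline window."""
    return int(deadline_hours * availability // cpu_hours)


if __name__ == "__main__":
    print(f"Minimum availability: {min_availability():.1%}")
    for share in (1.0, 0.5, 0.25):
        print(f"{share:.0%} availability -> "
              f"{tasks_per_core_before_deadline(share)} Task(s) per core per 72 h")
```

A host that crunches less than about one ninth of the time cannot finish even one Task per core before the deadline, which is the case where reducing the target CPU run time would make sense.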
©2024 University of Washington
https://www.bakerlab.org