Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
[quote]Hi all, new to Rosetta. I have been running Seti@Home since last fall with no issues. I added Rosetta@home on two PCs. Upon re-boot Rosetta is removed from the projects list. I have to add new task, select Rosetta and re-enter password info. Seti does not do this. What am I missing? I am using a different password from Seti if that makes a difference.[/quote] Something simple that seems worth trying: wait for at least one task to finish, then in the Advanced view click Activity, then Suspend. Wait at least one more minute before shutting down BOINC, Windows, or the computer. Let us know if doing this even once helps. |
Kevin N. Carpenter Joined: 6 Apr 20 Posts: 6 Credit: 6,614,362 RAC: 0 |
Moderator removed my post from the "moderator contact point assistence post here thread" so here goes. That is unusual. I added Rosetta to about a half-dozen machines yesterday and have not had that problem. Many have been restarted without issues. Sometimes it takes a few minutes for Rosetta to start back up, but that is all. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
[quote]Hi all, new to Rosetta. I have been running Seti@Home since last fall with no issues. I added Rosetta@home on two PCs. Upon re-boot Rosetta is removed from the projects list. I have to add new task, select Rosetta and re-enter password info. Seti does not do this. What am I missing? I am using a different password from Seti if that makes a difference.[/quote] Not sure what the issue is. I just used the BOINC Manager: Tools, Add project, select Rosetta@home, New user, with a different password from my other project, and everything went along OK from there. I didn't reboot the system till a few days after joining up, and the project was still there when it did reboot. Grant Darwin NT |
Stephen "Heretic" Joined: 2 Apr 20 Posts: 21 Credit: 11,028 RAC: 0 |
[quote]A Windows XP pc connected to internet? A Pentium 4? Yeah.... well.... the problem seems self explanatory, too old...[/quote] They have revised the app from 4.07/4.08 to 4.12. This new version asks a bit more from the hardware and seems to be a bit of a memory hog; see the thread on use of memory and Rosetta. I have an i5 with 8GB RAM and I have had problems getting it to run reliably. But there may be another cause. Stephen ? |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
[quote]They have revised the app from 4.07/4.08 to 4.12. This new version asks a bit more from the hardware and seems to be a bit of a memory hog[/quote] The Task being processed determines how much RAM is required. Present Tasks are using much less RAM (400MB to 800MB max) than ones from a week ago (400MB to 1.3GB max). Grant Darwin NT |
MarkJ Joined: 28 Mar 20 Posts: 72 Credit: 25,238,680 RAC: 0 |
Got 56 CF_monomer/Rosetta Mini work units that all failed with an instant "Error while computing". Running Linux x64. Other work units seem fine, just not these ones. I ended up aborting the few that hadn't committed suicide. I notice the scheduler is down now so maybe they are removing them from the queue. Example: https://boinc.bakerlab.org/rosetta/result.php?resultid=1142716718 Stderr <core_client_version>7.16.1</core_client_version> <![CDATA[ <message> process exited with code 255 (0xff, -1)</message> <stderr_txt> [2020- 4- 8 8:49:35:] :: BOINC:: Initializing ... ok. [2020- 4- 8 8:49:35:] :: BOINC :: boinc_init() BOINC:: Setting up shared resources ... ok. BOINC:: Setting up semaphores ... ok. BOINC:: Updating status ... ok. BOINC:: Registering timer callback... ok. BOINC:: Worker initialized successfully. command: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_3.78_x86_64-pc-linux-gnu -abinitio::fastrelax 1 -ex2aro 1 -frag3 00001.200.3mers -in:file:native 00001.pdb -corrections::beta_nov16 -silent_gz 1 -frag9 00001.200.9mers -out:file:silent default.out -ex1 1 -abinitio::rsd_wt_loop 0.5 -relax::default_repeats 15 -abinitio::use_filters false -abinitio::increase_cycles 10 -abinitio::rsd_wt_helix 0.5 -abinitio::rg_reweight 0.5 -in:file:boinc_wu_zip CF_monomer_28_data.zip -out:file:silent default.out -silent_gz -mute all -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 2362735 Registering options.. Registered extra options. Initializing broker options ... Registered extra options. Initializing core... Initializing options.... ok Options::initialize() Options::adding_options() Options::initialize() Check specs. Options::initialize() End reached ERROR: Option matching -corrections:beta_nov16 not found in command line top-level context </stderr_txt> ]]> BOINC blog |
shanen Joined: 16 Apr 14 Posts: 195 Credit: 12,662,308 RAC: 0 |
Trying and even eager to be of help, but... All these short-deadline units are troublesome. Is it accomplishing anything if my contributions are just discarded? And discarded for the sake of deadlines that seem quite arbitrary, even silly. Exacerbated by more checkpoint problems, too. Actually writing from the machine that has the most problems dealing with the deadlines, but even some of my bigger machines clearly have more queued tasks than they can possibly complete within the short deadlines. Obvious workaround (though it's tedious) is to manually abort the tasks that can't be completed, but that causes problems because the flow of tasks has become sporadic again... Plus it's wasting the bandwidth at the project end when they send data that is just discarded. On top of that, some of the machines wind up wasting time because of large batches of tasks with large memory requirements that cause the "Waiting for memory" status on some tasks. Again, selective nuking of tasks can get the CPUs busy again, but I'm NOT supposed to be spending time managing memory problems because the people running Rosetta@home can't figure it out... I'm fairly confident that BOINC has those capabilities to assess and manage memory, but it seems they are not being used by the Baker Lab people. I've currently earned over 12 million points, which is supposed to indicate a moderate contribution, but I'm thinking about moving along. The reason I switched to Rosetta was because the projects I used to support were not well managed. I'm sure I could even shop around for projects that are also working on Covid projects. In addition, if I were still supporting researchers, I would not recommend that they rely on data processed on Rosetta because such problems make the entire thing dubious... There were a couple of teams in the lab that are probably doing Covid stuff now (but I'm retired, so I have no idea). #1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech) |
MarkJ Joined: 28 Mar 20 Posts: 72 Credit: 25,238,680 RAC: 0 |
Now up to 68 with a few more waiting to run. Can we get these removed from the queue, please? Looking at my wingman on a few of them, they too are failing, so it's not just me. I don't want to get my hosts blocked from getting tasks due to what looks like a parameter error with these tasks. BOINC blog |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
[quote]Got 56 CF_monomer/Rosetta Mini work units that all failed with an instant "Error while computing". Running Linux x64. Other work units seem fine, just not these ones. I ended up aborting the few that hadn't committed suicide. [snip][/quote] I got a few under 64-bit Windows 10. Those hit errors within a few seconds of starting. |
SIXER (L.Gammel) Joined: 6 Apr 20 Posts: 1 Credit: 393,432 RAC: 0 |
I also switched from Seti to Rosetta but am getting a message about disk space. Please help: "Rosetta@home: Notice from server. Rosetta Mini needs 183.28MB more disk space. You currently have 1533.33 MB available and it needs 1716.61 MB." What do I need to do? Sixer |
Marcos Carot Joined: 30 Dec 11 Posts: 3 Credit: 301,124 RAC: 0 |
Yes, I got the same message... I just had to make space by deleting other stuff in the partition where BOINC stores its data. On Linux that is not the home directory; I think it was /var. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
[quote]I also switched from Seti to Rosetta but am getting a message about disk space. Please help:[/quote] Give it more disk space. In your Account, under Computing preferences, set Disk: Use no more than 12GB; Leave at least 2GB free; Use no more than 60% of total. Memory: When computer is in use, use at most 90%; When computer is not in use, use at most 95%. Even so, you will also need to add more RAM, or reduce the number of CPU cores/threads you use (down to 6 or even 5) so you don't run out of RAM (as much as 1.3GB of RAM per task can be required, although present tasks are only using 800MB or less). Grant Darwin NT |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
[quote]Trying and even eager to be of help, but...[/quote] Simple. Set a small cache: 1 day or less, with additional days at 0.02 or so. Make sure your checkpointing request is set for 60 seconds. Set the number of threads used for crunching equal to or less than the number that can be supported with the RAM your systems have, allowing for 1.3GB per task (unless you wish to add more RAM, which also solves the problem), and in your computing preferences allow BOINC to make use of the RAM you have (set it for 90% or higher). Don't abort the Tasks; if they miss the deadline the project will resend them to another host. Work gets done, errors are reduced (if not eliminated), the Manager will figure out how much work you can and can't do and stop getting too much, and it won't require frequent intervention on your part. Grant Darwin NT |
michelv Joined: 28 Mar 20 Posts: 8 Credit: 216,762 RAC: 0 |
[quote]I got a few under 64-bit Windows 10. Those hit errors within a few seconds of starting.[/quote] Same here, just a few minutes ago. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
[quote]Got 56 CF_monomer/Rosetta Mini work units that all failed with an instant "Error while computing". Running Linux x64. Other work units seem fine, just not these ones. I ended up aborting the few that hadn't committed suicide.[/quote] Decided to give mine a try (Win10). One Task ran with no problem, crunching away using minirosetta_3.78_windows_x86_64.exe. On the other system, every one was an instant Computation error; they were set to run with Rosetta Mini v3.78 windows_intelx86, and there was one Task using Rosetta Mini v3.78 windows_x86_64 that errored out as well. Presently processing normally (40min, so far so good): CF_monomer_12_fold_SAVE_ALL_OUT_905409_765_0 Rosetta Mini v3.78 windows_x86_64. All failed instantly: CF_monomer_78_fold_SAVE_ALL_OUT_905546_126_1 Rosetta Mini v3.78 windows_intelx86; CF_monomer_37_fold_SAVE_ALL_OUT_905505_203_0 Rosetta Mini v3.78 windows_intelx86; CF_monomer_103_fold_SAVE_ALL_OUT_905615_59_0 Rosetta Mini v3.78 windows_intelx86; CF_monomer_103_fold_SAVE_ALL_OUT_905620_58_0 Rosetta Mini v3.78 windows_intelx86; CF_monomer_103_relax_SAVE_ALL_OUT_905662_69_0 Rosetta Mini v3.78 windows_x86_64; CF_monomer_103_fold_SAVE_ALL_OUT_905589_56_0 Rosetta Mini v3.78 windows_intelx86; CF_monomer_103_fold_SAVE_ALL_OUT_905646_12_1 Rosetta Mini v3.78 windows_intelx86. Stderr: <core_client_version>7.6.33</core_client_version> <![CDATA[ <message> (unknown error) - exit code -1 (0xffffffff) </message> <stderr_txt> [2020- 4- 8 15:26:42:] :: BOINC:: Initializing ... ok. [2020- 4- 8 15:26:42:] :: BOINC :: boinc_init() BOINC:: Setting up shared resources ... ok. BOINC:: Setting up semaphores ... ok. BOINC:: Updating status ... ok. BOINC:: Registering timer callback... ok. BOINC:: Worker initialized successfully. command: projects/boinc.bakerlab.org_rosetta/minirosetta_3.78_windows_x86_64.exe -abinitio::fastrelax 1 -ex2aro 1 -frag3 00001.200.3mers -in:file:native 00001.pdb -corrections::beta_nov16 -silent_gz 1 -frag9 00001.200.9mers -out:file:silent default.out -ex1 1 -abinitio::rsd_wt_loop 0.5 -relax::default_repeats 15 -abinitio::use_filters false -abinitio::increase_cycles 10 -abinitio::rsd_wt_helix 0.5 -abinitio::rg_reweight 0.5 -in:file:boinc_wu_zip CF_monomer_103_data.zip -out:file:silent default.out -silent_gz -mute all -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 2047753 Registering options.. Registered extra options. Initializing broker options ... Registered extra options. Initializing core... Initializing options.... ok Options::initialize() Options::adding_options() Options::initialize() Check specs. Options::initialize() End reached ERROR: Option matching -corrections:beta_nov16 not found in command line top-level context </stderr_txt> ]]> Grant Darwin NT |
MarkJ Joined: 28 Mar 20 Posts: 72 Credit: 25,238,680 RAC: 0 |
<snipped> Running a 0.33 day cache on my 6 core/12 thread machines with 32GB of memory. On the couple of Pis that I have running Rosetta, I had to create an app_config to limit the number of tasks running at once. BOINC blog |
jackielan2000 Joined: 5 Sep 06 Posts: 13 Credit: 14,208 RAC: 0 |
[quote]A Windows XP pc connected to internet? A Pentium 4? Yeah.... well.... the problem seems self explanatory, too old...[/quote] 8G RAM? Well, that means it can only be run in x64 systems. Ok, bye bye Rosetta. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
[quote]8G RAM? Well, that means it can only be run in x64 systems. Ok, bye bye Rosetta.[/quote] You need up to 1.3GB RAM per Task you run. So even a system with less than 4GB can run 2 Tasks. Grant Darwin NT |
Rames Joined: 12 Nov 16 Posts: 5 Credit: 3,000,551 RAC: 0 |
Hi there! I am having calculation error problems too. I have two older 45nm Opteron servers, each with 24GB RAM and 12 threads. One of them can only run 5 Rosetta threads at the moment; all the others fail. Both run Linux on old HDDs. The other server has 2 tasks running now, and I can report back later when its WCG projects are done, but I've seen some errors already. 32nm Xeon servers with less RAM per thread are running fine with the latest Rosetta version, as are Ryzen-based computers. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
[quote]I am having calculation error problems too.[/quote] With your systems hidden it's difficult to help, but there are faulty Rosetta Mini Tasks about at present that fail pretty much instantly. Grant Darwin NT |
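
Grant's rule of thumb in the thread above (allow up to 1.3GB of RAM per task, and let BOINC use about 90% of memory) comes down to simple arithmetic. The following is a minimal Python sketch of that calculation, not anything BOINC itself provides: the function name `max_tasks_for_ram` and its parameters are made up for illustration, and the 1.3GB and 90% defaults are simply the figures quoted in the posts.

```python
import math
from typing import Optional

# Peak RAM per Rosetta task as quoted in the thread (assumption: worst case).
RAM_PER_TASK_GB = 1.3


def max_tasks_for_ram(
    total_ram_gb: float,
    boinc_ram_fraction: float = 0.9,
    ram_per_task_gb: float = RAM_PER_TASK_GB,
    cpu_threads: Optional[int] = None,
) -> int:
    """Estimate how many tasks fit in the RAM that BOINC is allowed to use.

    Hypothetical helper for illustration only; it is not part of BOINC.
    """
    if total_ram_gb <= 0 or ram_per_task_gb <= 0:
        raise ValueError("RAM sizes must be positive")
    if not 0 < boinc_ram_fraction <= 1:
        raise ValueError("boinc_ram_fraction must be in (0, 1]")

    usable_gb = total_ram_gb * boinc_ram_fraction
    fit = math.floor(usable_gb / ram_per_task_gb)
    # Never suggest more tasks than there are CPU threads to run them.
    if cpu_threads is not None:
        fit = min(fit, cpu_threads)
    return fit


print(max_tasks_for_ram(8))                   # 8GB machine -> 5
print(max_tasks_for_ram(3))                   # under 4GB   -> 2
print(max_tasks_for_ram(32, cpu_threads=12))  # 32GB, 12 threads -> 12
```

With these assumptions an 8GB machine comes out at 5 concurrent tasks, in line with the advice to drop "down to 6 or even 5", and a machine with a little under 4GB still fits 2.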
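
MarkJ mentions creating an app_config to limit how many tasks run at once on his Pis, but the thread does not show the file. The sketch below is one guess at what such a file could look like, generated from Python. It assumes the BOINC client's `project_max_concurrent` option in `app_config.xml`, and that the file belongs in the project directory seen in the logs above (`projects/boinc.bakerlab.org_rosetta/`). The helper name `build_app_config` is hypothetical, and the limit of 2 is an arbitrary example, not MarkJ's actual setting.

```python
import xml.etree.ElementTree as ET


def build_app_config(max_concurrent: int) -> str:
    """Return app_config.xml text that caps concurrent tasks for one project.

    Hypothetical helper; each tag is placed on its own line for readability.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")

    xml_text = (
        "<app_config>\n"
        f"   <project_max_concurrent>{max_concurrent}</project_max_concurrent>\n"
        "</app_config>\n"
    )
    # Sanity check: make sure what we generated is well-formed XML.
    ET.fromstring(xml_text)
    return xml_text


# Example: allow at most 2 Rosetta tasks at a time (arbitrary value).
print(build_app_config(2))
```

The printed text would be saved as `app_config.xml` in the project directory, after which the client needs to re-read its config files (or be restarted) for the limit to take effect. Check the BOINC client configuration documentation before relying on any of this.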
©2024 University of Washington
https://www.bakerlab.org