Problems and Technical Issues with Rosetta@home

Message boards : Number crunching : Problems and Technical Issues with Rosetta@home

Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103582 - Posted: 28 Nov 2021, 17:31:50 UTC - in response to Message 103577.  

no more work ? really?

I am continuing to get work, though most of it is _1.

And the new ones seem to be taking longer, or at least the estimates are.
It may just be how my machines are set up. I am doing more work units now, but they may be cache-limited.
ID: 103582 · Rating: 0
.clair.

Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103583 - Posted: 28 Nov 2021, 19:33:58 UTC - in response to Message 103582.  

no more work ? really?

I am continuing to get work, though most of it is _1.
And the new ones seem to be taking longer, or at least the estimates are.
It may just be how my machines are set up. I am doing more work units now, but they may be cache-limited.

You may be getting the ones that I 'aborted', 'errored', 'crashed'
wimin drivers . . . . :) Nnnnnn,,
ID: 103583 · Rating: 0
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103584 - Posted: 28 Nov 2021, 21:56:26 UTC - in response to Message 103583.  

Abort some more. They are really out of pythons now, though I did pick up a few of the regular Rosettas.
But even they seem to be out now. On some of my machines, I can make it until tomorrow. On others, I can't.
ID: 103584 · Rating: 0
Greg_BE

Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103585 - Posted: 28 Nov 2021, 23:19:00 UTC - in response to Message 103584.  

Abort some more. They are really out of pythons now, though I did pick up a few of the regular Rosettas.
But even they seem to be out now. On some of my machines, I can make it until tomorrow. On others, I can't.



Hurry up...3 left
ID: 103585 · Rating: 0
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103586 - Posted: 28 Nov 2021, 23:55:25 UTC - in response to Message 103585.  

Hurry up...3 left
Yes, you have to get them when you can. But I pick up a few more from time to time, so I should make it until tomorrow.
Hopefully they will throw some more in the hopper.
ID: 103586 · Rating: 0
nwayno

Joined: 28 May 20
Posts: 6
Credit: 7,006,260
RAC: 29
Message 103588 - Posted: 29 Nov 2021, 4:07:13 UTC - in response to Message 80629.  

Yes, there have been no work units for several weeks. I switched to World Community Grid. My Raspberry Pis have nothing to do, so I am powering those off.

It would certainly help, as you said, if they posted something like: "Yeah, it's broke, we're working on it." I will check in again after the first of the year as well.
ID: 103588 · Rating: 0
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1673
Credit: 17,596,129
RAC: 22,414
Message 103589 - Posted: 29 Nov 2021, 6:44:04 UTC - in response to Message 103588.  

Yes, there have been no work units for several weeks.
Not true.
There have been periods over the last 3 weeks when no new Rosetta 4.20 tasks were available from the project: there were 4 days with no new work from Nov 11th, and after that it was generally 1-2 days between spurts of new Rosetta 4.20 work, along with the occasional batch of RB tasks being sent out as well.
But it is only in the last 36 hours or so that there has been no new Python work available either, just the very occasional resend.
Grant
Darwin NT
ID: 103589 · Rating: 0
Jean-David Beyer

Joined: 2 Nov 05
Posts: 187
Credit: 6,370,517
RAC: 5,725
Message 103591 - Posted: 29 Nov 2021, 13:51:54 UTC - in response to Message 103586.  

Yes, you have to get them when you can. But I pick up a few more from time to time, so I should make it until tomorrow.
Hopefully they will throw some more in the hopper.

I see there are lots of them in the hopper,

Rosetta: 3674 unsent · 63066 in progress · runtime 6.88 h average (0.24 - 56.39) · 2050 users in last 24 hours

and my machine asks for some, but does not get any.

Mon 29 Nov 2021 08:40:33 AM EST | Rosetta@home | update requested by user
Mon 29 Nov 2021 08:40:38 AM EST | Rosetta@home | Sending scheduler request: Requested by user.
Mon 29 Nov 2021 08:40:38 AM EST | Rosetta@home | Requesting new tasks for CPU
Mon 29 Nov 2021 08:40:40 AM EST | Rosetta@home | Scheduler request completed: got 0 new tasks
Mon 29 Nov 2021 08:40:40 AM EST | Rosetta@home | No tasks sent
Mon 29 Nov 2021 08:40:40 AM EST | Rosetta@home | Project requested delay of 31 seconds
Mon 29 Nov 2021 08:41:15 AM EST | Rosetta@home | Sending scheduler request: To fetch work.
Mon 29 Nov 2021 08:41:15 AM EST | Rosetta@home | Requesting new tasks for CPU
Mon 29 Nov 2021 08:41:16 AM EST | Rosetta@home | Scheduler request completed: got 0 new tasks
Mon 29 Nov 2021 08:41:16 AM EST | Rosetta@home | No tasks sent
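
A log excerpt like the one above can be tallied mechanically. The following is a minimal Python sketch, not part of BOINC: the function name `summarize_scheduler_replies` and both regular expressions are assumptions based only on the line layout shown in this post ("timestamp | project | message").

```python
import re

# Assumed layout, taken from the excerpt: "<timestamp> | <project> | <message>"
LOG_LINE = re.compile(r"^(?P<when>.+?) \| (?P<project>.+?) \| (?P<message>.*)$")
GOT_TASKS = re.compile(r"Scheduler request completed: got (\d+) new tasks")


def summarize_scheduler_replies(lines, project="Rosetta@home"):
    """Return (replies, tasks_received) for one project (hypothetical helper)."""
    replies = 0
    tasks_received = 0
    for line in lines:
        parsed = LOG_LINE.match(line.strip())
        if parsed is None or parsed.group("project") != project:
            continue  # skip lines in another format or from another project
        got = GOT_TASKS.search(parsed.group("message"))
        if got:
            replies += 1
            tasks_received += int(got.group(1))
    return replies, tasks_received


# Three lines copied from the excerpt above.
SAMPLE = [
    "Mon 29 Nov 2021 08:40:38 AM EST | Rosetta@home | Requesting new tasks for CPU",
    "Mon 29 Nov 2021 08:40:40 AM EST | Rosetta@home | Scheduler request completed: got 0 new tasks",
    "Mon 29 Nov 2021 08:41:16 AM EST | Rosetta@home | Scheduler request completed: got 0 new tasks",
]

print(summarize_scheduler_replies(SAMPLE))
```

Two replies with zero tasks between them is what "asks for some, but does not get any" looks like in numbers; run over a whole day's log, the same tally would show whether any work arrived at all.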

ID: 103591 · Rating: 0
Peter Humphrey

Joined: 26 Jul 18
Posts: 5
Credit: 4,256,666
RAC: 4,788
Message 103592 - Posted: 29 Nov 2021, 16:35:24 UTC - in response to Message 103553.  

In that case I may decide I can't afford to stay with this project. It's far too much of a memory hog; I've suspended it while I debate with myself. Who'd have thought that 64GB RAM would be too little, even with 24 processors?
ID: 103592 · Rating: 0
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103593 - Posted: 29 Nov 2021, 17:11:46 UTC - in response to Message 103592.  

Who'd have thought that 64GB RAM would be too little, even with 24 processors?

If you are willing to jump through some hoops (though they are actually rather easy), there is a way, by running multiple BOINC instances.
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6893&postid=103516#103516

You then run as many as you can (8 for example) in each instance. That works because you only need a lot of memory to download them, not run them.
Here is how to set it up:
https://www.overclock.net/threads/guide-setting-up-multiple-boinc-instances.1628924/

I already had a second BOINC instance set up on a Ryzen 3900X with 96 GB of memory, so I can use all 24 cores (12 per instance).
Also, I added a second BOINC instance to a Ryzen 3950X with 128 GB of memory.
They are all under Ubuntu 20.04.3, but it works on Windows as well. It is just a bit easier to start up automatically in Linux.

You can use three BOINC instances (or more) if you need to.
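
As a rough illustration of the setup described above, here is a small Python sketch that only builds the command line for an additional client and splits the cores between instances; it does not start anything. The helper names, the data directory and the port number are placeholders chosen for this example. `--allow_multiple_clients`, `--dir` and `--gui_rpc_port` are BOINC client options as far as I know, but check the linked guide for the exact procedure on your system.

```python
def extra_instance_command(data_dir, rpc_port):
    """Build (not run) the argv for an additional BOINC client.

    data_dir and rpc_port are placeholders; each instance needs its own
    data directory and its own GUI RPC port.
    """
    return [
        "boinc",
        "--allow_multiple_clients",
        "--dir", data_dir,
        "--gui_rpc_port", str(rpc_port),
    ]


def cores_per_instance(total_cores, instances):
    """Split cores as evenly as possible across instances."""
    if instances < 1:
        raise ValueError("need at least one instance")
    base, extra = divmod(total_cores, instances)
    return [base + (1 if i < extra else 0) for i in range(instances)]


# Hypothetical directory and port, for illustration only.
print(" ".join(extra_instance_command("/var/lib/boinc2", 31418)))
print(cores_per_instance(24, 2))
```

With 24 cores and two instances the split is 12 and 12, which matches the 3900X arrangement described in the post.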
ID: 103593 · Rating: 0
Jean-David Beyer

Joined: 2 Nov 05
Posts: 187
Credit: 6,370,517
RAC: 5,725
Message 103594 - Posted: 29 Nov 2021, 22:40:16 UTC - in response to Message 103553.  

I messed around with app_config but that can make a mess of things.


I am using this, and that sets my upper bound to three at a time. What would be the symptoms of the mess to which you refer?

[/var/lib/boinc/projects/boinc.bakerlab.org_rosetta]$ cat app_config.xml 
<app_config>
   <project_max_concurrent>3</project_max_concurrent>
</app_config>
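
For anyone who prefers to generate or check this file with a script, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The helper names `build_app_config` and `read_project_limit` are made up for this example; only the element names `app_config` and `project_max_concurrent` come from the file shown above.

```python
import xml.etree.ElementTree as ET


def build_app_config(limit):
    """Return app_config.xml text that caps running tasks for one project."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    root = ET.Element("app_config")
    ET.SubElement(root, "project_max_concurrent").text = str(limit)
    return ET.tostring(root, encoding="unicode")


def read_project_limit(xml_text):
    """Return the project_max_concurrent value, or None if it is absent."""
    root = ET.fromstring(xml_text)
    if root.tag != "app_config":
        raise ValueError("root element must be <app_config>")
    node = root.find("project_max_concurrent")
    return None if node is None else int(node.text)


text = build_app_config(3)
print(text)
print(read_project_limit(text))
```

The generated text is equivalent to the file above, just without the indentation. The file belongs in the project's own directory, as the shell prompt above shows, and the client has to re-read its config files (or be restarted) before a change takes effect.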

ID: 103594 · Rating: 0
Greg_BE

Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103595 - Posted: 29 Nov 2021, 22:58:37 UTC - in response to Message 103594.  

I messed around with app_config but that can make a mess of things.


I am using this, and that sets my upper bound to three at a time. What would be the symptoms of the mess to which you refer?

[/var/lib/boinc/projects/boinc.bakerlab.org_rosetta]$ cat app_config.xml 
<app_config>
   <project_max_concurrent>3</project_max_concurrent>
</app_config>



project? I have never seen that before.
Most of us tried to use max_concurrent and then got buried in tons of tasks we could never complete by their deadlines.
ID: 103595 · Rating: 0
Greg_BE

Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103596 - Posted: 29 Nov 2021, 23:01:43 UTC

Something does not make sense
Says 2,000 tasks queued.
Had a look at the schedulers...0 on all projects.
So are the 2,000 not released yet or have they all been taken and the system did not update?
ID: 103596 · Rating: 0
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103597 - Posted: 29 Nov 2021, 23:38:29 UTC - in response to Message 103595.  

project? I have never seen that before.
Most of us tried to use max_concurrent and then got buried in tons of tasks we could never complete by their deadlines.

project_max_concurrent will limit the total number of work units running at once for that project, across all of its applications.

But either one of them can cause the problem of excessive downloads.
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5720&postid=45319
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5720&postid=45323
ID: 103597 · Rating: 0
Falconet

Joined: 9 Mar 09
Posts: 353
Credit: 1,227,479
RAC: 3,710
Message 103598 - Posted: 30 Nov 2021, 1:11:40 UTC - in response to Message 103596.  

The queue figure only updates about once every 4 hours, while the server status page updates every 30 minutes or so.
ID: 103598 · Rating: 0
Greg_BE

Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103603 - Posted: 30 Nov 2021, 7:13:14 UTC - in response to Message 103597.  

project? I have never seen that before.
Most of us tried to use max_concurrent and then got buried in tons of tasks we could never complete by their deadlines.

project_max_concurrent will limit the total number of work units running at once for that project, across all of its applications.

But either one of them can cause the problem of excessive downloads.
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5720&postid=45319
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5720&postid=45323



So it comes down to this: any attempt to limit the number of tasks will cause excessive downloads.
What if you rolled back in versions of BOINC?
ID: 103603 · Rating: 0
Bryn Mawr

Joined: 26 Dec 18
Posts: 389
Credit: 12,073,013
RAC: 8,289
Message 103608 - Posted: 30 Nov 2021, 9:16:35 UTC - in response to Message 103603.  

project? I have never seen that before.
Most of us tried to use max_concurrent and then got buried in tons of tasks we could never complete by their deadlines.

project_max_concurrent will limit the total number of work units running at once for that project, across all of its applications.

But either one of them can cause the problem of excessive downloads.
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5720&postid=45319
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5720&postid=45323



So it comes down to this: any attempt to limit the number of tasks will cause excessive downloads.
What if you rolled back in versions of BOINC?


Not will, can.

I’ve been running project_max_concurrent on most projects for several years with no excess downloads.
ID: 103608 · Rating: 0
Peter Humphrey

Joined: 26 Jul 18
Posts: 5
Credit: 4,256,666
RAC: 4,788
Message 103610 - Posted: 30 Nov 2021, 11:53:55 UTC - in response to Message 103593.  

You then run as many as you can (8 for example) in each instance. That works because you only need a lot of memory to download them, not run them. Here is how to set it up:


I don't understand. When I run boincmgr it shows several jobs as "waiting for memory". How can adding yet more of them release memory? And why would boinc need more memory to download a job than to run it?

(This is Gentoo Linux.)
ID: 103610 · Rating: 0
Falconet

Joined: 9 Mar 09
Posts: 353
Credit: 1,227,479
RAC: 3,710
Message 103611 - Posted: 30 Nov 2021, 12:04:53 UTC - in response to Message 103610.  

You then run as many as you can (8 for example) in each instance. That works because you only need a lot of memory to download them, not run them. Here is how to set it up:


I don't understand. When I run boincmgr it shows several jobs as "waiting for memory". How can adding yet more of them release memory? And why would boinc need more memory to download a job than to run it?

(This is Gentoo Linux.)



Read my post. I hope it helps.

https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6893&postid=103572
ID: 103611 · Rating: 0
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103612 - Posted: 30 Nov 2021, 14:02:46 UTC - in response to Message 103611.  

Falconet has the right answer. But I would only add that it is the project that sets the memory requirements, not BOINC.
If they say more, then BOINC just obeys. (The memory isn't released, it is just reserved.)

And adding a second BOINC instance gives you another bite at the apple. One BOINC instance doesn't see what the other one is doing.
So if the pythons ever do require more memory to run, that could cause problems. But we are a long way from that at the moment.
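
To make the "another bite at the apple" point concrete, here is a toy Python model of the accounting described above. It is not BOINC's actual code. The 64 GB figure comes from earlier in this thread; the usable fraction (0.9) and the per-task figure (7 GB) are hypothetical numbers picked for the example, not values published by the project.

```python
import math


def tasks_allowed(ram_gb, usable_fraction, declared_gb_per_task):
    """Tasks one client may hold if each is counted at its declared requirement."""
    if declared_gb_per_task <= 0:
        raise ValueError("declared requirement must be positive")
    return math.floor(ram_gb * usable_fraction / declared_gb_per_task)


def nominal_reservation_gb(instances, ram_gb, usable_fraction, declared_gb_per_task):
    """Total declared memory when each instance budgets on its own."""
    per_instance = tasks_allowed(ram_gb, usable_fraction, declared_gb_per_task)
    return instances * per_instance * declared_gb_per_task


# Hypothetical inputs: 64 GB host, 90% usable, 7 GB declared per task.
print(tasks_allowed(64, 0.9, 7.0))
print(nominal_reservation_gb(2, 64, 0.9, 7.0))
```

In this model one client is allowed 8 tasks, while two independent clients are allowed 16, a nominal 112 GB on a 64 GB machine. That is harmless as long as the tasks actually use far less than they declare, and it is exactly the risk mentioned above if they ever start using what they ask for.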
ID: 103612 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org