Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Sid Celery Send message Joined: 11 Feb 08 Posts: 2115 Credit: 41,115,753 RAC: 19,563 |
Using the number of tasks In Progress as a proxy for how successful people are at downloading tasks Currently 384k in progress - loss reduced to 30% <guessing> maybe back-up project tasks are being replaced by Rosetta? Every little helps |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2115 Credit: 41,115,753 RAC: 19,563 |
I've been in contact with Project admins and this was a deliberate change, not a misconfiguration. There had been some talk of larger tasks for more capable machines in the past. You may well be right that it was an attempt to provide them. But on machines with lower available resources, they seem not to get anything rather than only being offered low-resource-reqt tasks. And now it seems <everything> needs large resources. I'm sure there's a better way of implementing the provision of appropriately-sized tasks, but no-one's hit on it yet. Perhaps it needs info from the host requesting tasks first. But I'm guessing again. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1671 Credit: 17,531,042 RAC: 22,700 |
They were still going through Yesterday, but given the low percentage of errors i didn't consider them to be an issue. That you did have such a high number of errors indicated that there was something going on with your system. I've had a couple of miniprotein_relax8_ error out after a while with a similar error message. Haven't all those tasks been aborted by the server now? Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1671 Credit: 17,531,042 RAC: 22,700 |
The disk I/O from BOINC projects is bugger all as a factor of DWPD (Drive Writes Per Day), even for a system with 64 cores/128 threads all in use. Depends what you mean by normal. Mine has a security camera recording onto it, two graphics cards and a 24 core CPU doing Boinc, I record TV to it, .... I guess there are some people who just play solitaire and use email, those might last that long. And as i indicated with that link i posted, you are talking about decades for normal drives under normal usage conditions. SSD Endurance Experiment. I've read many articles complaining that SSDs last nowhere near as long as HDDs. A few HDDs do fail unexpectedly, but SSDs wear out, because they have a finite number of writes. They cannot possibly last longer than that time. And SSDs used for recording video streams 24/7 will also last just as long if they have plenty of free space (30% or more) to allow for garbage collection & wear levelling to occur as needed. Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1671 Credit: 17,531,042 RAC: 22,700 |
I'm sure there's a better way of implementing the provision of appropriately-sized tasks, but no-one's hit on it yet. There's a simple quick & dirty method that would be easy for the project to implement. The present application is v4.2x. The project compiles another copy, exactly the same, calls it v5.2x, and uses that one for processing large RAM requirement Tasks. In the Rosetta@home preferences they give the option of which version to run. The default for current & new users is v4.2x. People can choose to also process large RAM Tasks by selecting v5.2x, e.g. Default settings: Run only the selected applications: Rosetta v4: yes, Rosetta v5: no. If no work for selected applications is available, accept work from other applications? no. Settings for those that choose to run large RAM Tasks: Run only the selected applications: Rosetta v4: yes, Rosetta v5: yes. If no work for selected applications is available, accept work from other applications? no. People can also choose to run just the one type, but do the other type if their preferred type isn't available at the time they request work, by setting the bottom line "If no work..." to yes. When a Work Unit is created, the researcher flags which application needs to be used to process it: regular or large RAM requirement. That way any Task that requires large amounts of RAM will only go to systems that are capable of handling it (if the user pays attention to the requirements before selecting the option to do those types of Tasks....). Of course when they move beyond v4, they'd need to go to v6 for regular Tasks, and v7 for large RAM Tasks, and update the Rosetta preferences page, and let people know what's happening before hand. Grant Darwin NT |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,716,372 RAC: 18,198 |
It's my computer and they can't make me do anything, including pay for it.[quote]From Sid Celery 31 Mar9 Apr |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,716,372 RAC: 18,198 |
It could also be people manually doing other things. I sometimes like to concentrate on one project. If that runs out of work, I'll pick another and might not be back for a while. Somebody just knocked me into 3rd place elsewhere, this will not do, back in a week.... Using the number of tasks In Progress as a proxy for how successful people are at downloading tasks |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,716,372 RAC: 18,198 |
I don't know exactly how they work, but let's say I have mine 80% full of stuff that remains there. Then I repeatedly write to the remaining 20%. That 20% will wear out. When there aren't enough spare bits to reallocate, won't it just say "I'm now a smaller disk"? 80% of the drive is pretty much as new! The disk I/O from BOINC projects is bugger all as a factor of DWPD (Drive Writes Per Day), even for a system with 64 cores/128 threads all in use. Depends what you mean by normal. Mine has a security camera recording onto it, two graphics cards and a 24 core CPU doing Boinc, I record TV to it, .... I guess there are some people who just play solitaire and use email, those might last that long. And as i indicated with that link i posted, you are talking about decades for normal drives under normal usage conditions. SSD Endurance Experiment. I've read many articles complaining that SSDs last nowhere near as long as HDDs. A few HDDs do fail unexpectedly, but SSDs wear out, because they have a finite number of writes. They cannot possibly last longer than that time. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,716,372 RAC: 18,198 |
I'm sure there's a better way of implementing the provision of appropriately-sized tasks, but no-one's hit on it yet. There's a simple quick & dirty method that would be easy for the project to implement. An excellent idea, although doesn't the server know how much RAM I have? It does, I just checked in the likes of this: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=3746264 It could send large tasks only to people who have, say, at least 16GB of RAM, or even xGB RAM per core. |
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 204 |
I've had a couple of work units crash out after 30-40 seconds this afternoon. Exit status 0x00000001. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,118,186 RAC: 5,220 |
[quote]From Sid Celery 31 Mar9 Apr LOL!!! They are Borg and you will be assimilated!!! |
mrhastyrib Send message Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
let people know what's happening before hand. HA HA HA HA HA!! [wipes tears from eyes] HOO HEE HA HA HA!!! |
mrhastyrib Send message Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
wear levelling It was literally right there in the post that you quoted. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2115 Credit: 41,115,753 RAC: 19,563 |
They were still going through Yesterday, but given the low percentage of errors i didn't consider them to be an issue. That you did have such a high number of errors indicated that there was something going on with your system. I've had a couple of miniprotein_relax8_ error out after a while with a similar error message. Haven't all those tasks been aborted by the server now? You make a good point tbf. I'm getting even more errors atm, but without the rebooting of the PC. Something bad definitely going on with my machine, but with everything else happening it's been hard for me to determine the cause up to now. Time for some tweaking #brb |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
I've had a couple of work units crash out after 30-40 seconds this afternoon. Exit status 0x00000001. These tasks? https://boinc.bakerlab.org/rosetta/results.php?hostid=3117659&offset=0&show_names=0&state=6&appid= If so, the task logs appear to show that attempting to extract one input file each from the database failed, probably because they weren't in the database. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2115 Credit: 41,115,753 RAC: 19,563 |
I'm sure there's a better way of implementing the provision of appropriately-sized tasks, but no-one's hit on it yet. There's a simple quick & dirty method that would be easy for the project to implement. Neat idea. I'll let you tell them... |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2115 Credit: 41,115,753 RAC: 19,563 |
They were still going through Yesterday, but given the low percentage of errors i didn't consider them to be an issue. That you did have such a high number of errors indicated that there was something going on with your system. I've had a couple of miniprotein_relax8_ error out after a while with a similar error message. Haven't all those tasks been aborted by the server now? And seconds after posting, my PC blue-screened. Almost like it knew I was talking about it. Tweak done - let's see how it goes. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
I'm sure there's a better way of implementing the provision of appropriately-sized tasks, but no-one's hit on it yet. There's a simple quick & dirty method that would be easy for the project to implement. I'd prefer to see the large RAM tasks marked by adding a letter to the application name, so that it does not interfere with the version numbering. This allows adding a different letter for yet another RAM size. Otherwise, a good idea. |
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 204 |
>>> These tasks? Err, yes, err, obviously... >>> If so, the task logs appear to show that attempting to extract one input file each from the database failed, probably because they weren't in the database. Indeed, it was a problem with Rosetta@home, which is why I mentioned it in the thread called "Problems and Technical Issues with Rosetta@home"... Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1671 Credit: 17,531,042 RAC: 22,700 |
Of course if you were to treat a HDD the way you described, you would considerably shorten its life expectancy as well. And SSDs used for recording video streams 24/7 will also last just as long if they have plenty of free space (30% or more) to allow for garbage collection & wear levelling to occur as needed. I don't know exactly how they work, but let's say I have mine 80% full of stuff that remains there. Then I repeatedly write to the remaining 20%. That 20% will wear out. When there aren't enough spare bits to reallocate, won't it just say "I'm now a smaller disk"? 80% of the drive is pretty much as new! HDDs that spend all their time thrashing tend to die very young. You did see where i wrote about leaving sufficient free space? At least 30%? If there is only 20% free space on the drive (SSD or HDD) it is for all intents and purposes full and should be replaced with a much larger unit, or more space freed up. Grant Darwin NT |
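
For anyone who wants to see the task-size idea from the thread in concrete form, here is a rough sketch combining the two suggestions: a researcher-set "large RAM" flag on each task plus a user opt-in in the preferences, and a server-side check of the host's reported RAM. This is not BOINC or Rosetta@home code. Every name in it (`Host`, `Task`, `tasks_for`, the preference fields) is hypothetical, and the 16 GB threshold is just the "say at least 16GB" figure mentioned above.

```python
from dataclasses import dataclass
from typing import List

# Assumption: threshold taken from the "say at least 16GB" suggestion in the thread.
MIN_TOTAL_RAM_GB = 16.0


@dataclass
class Host:
    """Hypothetical view of a host: reported RAM plus the user's preferences."""
    ram_gb: float
    wants_regular: bool = True          # "Rosetta v4: yes" in the proposed prefs
    wants_large_ram: bool = False       # "Rosetta v5: no" by default
    accept_other_if_none: bool = False  # the "If no work..." fallback option


@dataclass
class Task:
    """Hypothetical task record; large_ram is the flag set by the researcher."""
    name: str
    large_ram: bool


def host_can_run(host: Host, task: Task) -> bool:
    """The RAM check only applies to tasks flagged as large RAM."""
    return (not task.large_ram) or host.ram_gb >= MIN_TOTAL_RAM_GB


def host_selected(host: Host, task: Task) -> bool:
    """Did the user tick the application type this task belongs to?"""
    return host.wants_large_ram if task.large_ram else host.wants_regular


def tasks_for(host: Host, queue: List[Task]) -> List[Task]:
    """Return the tasks this host may be sent, honouring prefs and RAM."""
    runnable = [t for t in queue if host_can_run(host, t)]
    preferred = [t for t in runnable if host_selected(host, t)]
    if preferred or not host.accept_other_if_none:
        return preferred
    # Nothing of the preferred type is available: fall back to anything runnable.
    return runnable


if __name__ == "__main__":
    queue = [Task("regular_1", large_ram=False), Task("bigmem_1", large_ram=True)]
    for host in (Host(ram_gb=8), Host(ram_gb=32, wants_large_ram=True)):
        print(host.ram_gb, [t.name for t in tasks_for(host, queue)])
```

The point of keeping both checks is that a low-RAM host still gets regular tasks instead of nothing, and a host that opts in without enough RAM is simply never offered the large ones.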
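
The SSD endurance argument above comes down to simple arithmetic: how much is written per day relative to the drive's capacity (DWPD), and how long that takes to reach the drive's rated total writes. A minimal sketch, assuming a constant write rate; the capacity, rated endurance, and daily-write figures below are made-up illustration values, not measurements from anyone's system in this thread, and the function names are hypothetical.

```python
def dwpd(daily_writes_gb: float, capacity_gb: float) -> float:
    """Drive Writes Per Day: fraction of the drive's capacity written each day."""
    return daily_writes_gb / capacity_gb


def years_to_rated_endurance(rated_tbw_tb: float, daily_writes_gb: float) -> float:
    """Years until the rated total-terabytes-written figure is reached."""
    days = (rated_tbw_tb * 1000.0) / daily_writes_gb  # using 1 TB = 1000 GB
    return days / 365.0


if __name__ == "__main__":
    capacity_gb = 1000.0   # assumed 1 TB drive
    rated_tbw_tb = 600.0   # assumed endurance rating, for illustration only
    for label, daily_gb in (("light use", 20.0), ("heavy use", 200.0)):
        print(
            f"{label}: DWPD={dwpd(daily_gb, capacity_gb):.2f}, "
            f"years to rated endurance={years_to_rated_endurance(rated_tbw_tb, daily_gb):.1f}"
        )
```

Plugging in your own drive's rating and measured daily writes shows which side of the "decades vs. wears out quickly" argument your usage falls on.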
©2024 University of Washington
https://www.bakerlab.org