Message boards : Number crunching : computation errors
Author | Message |
---|---|
adrianxw Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 137 |
I am a little puzzled by your message actually. >>> Did you even pay the slightest bit of attention to the statement i was responding to??? Well, I didn't really need to, since the statement you appeared to be responding to was written by me. I do know how BOINC works. I've been with it, and David's earlier version (probably not the right word, but it conveys the meaning), since 1998, since its first appearance. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1673 Credit: 17,595,878 RAC: 22,416 |
>>> I am a little puzzled by your message actually. Not nearly as puzzled as i am by yours. >>> Did you even pay the slightest bit of attention to the statement i was responding to??? Which makes your response all the more baffling to me- i would have hoped you'd realise what it was you had said, and so would understand what was being answered. Yet that didn't happen- you still quoted me out of context, took my answer to one statement, and applied it to something different. >>> I do know how BOINC works, I've been with it, and David's earlier version, (probably not the right word, but it conveys the meaning, anyway, since 1998), since its first appearance. And yet you attributed normal BOINC behaviour to some action by the project. Here is your statement: >>> Perhaps someone at the project noticed the problem. I had a large number of these quickly crashing work units, on both my machines, but neither has downloaded new work for a couple of hours. Anyone that knows how BOINC works would know that's what happens when a load of rubbish goes out that just causes errors- BOINC backs off with its Scheduler contacts, with increasing backoff periods. Hence no downloads for a couple of hours. So why did you attribute what is normal BOINC behaviour under those circumstances to something the project did? And that is why i gave the answer i did- pointing out normal BOINC behaviour to someone that obviously didn't know it, and that it was not a result of any action the project may or may not have taken in relation to their faulty Tasks. Grant Darwin NT |
adrianxw Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 137 |
>>> Not nearly as puzzled as i am by yours. Oh, I see, my puzzlement is bigger than yours! I still have no idea of what you're on about. Rosetta USED to be a project with endless work units from their various work projects to download and crunch. Since late last year, that changed. Now, it is a project that produces a batch of work, lets that run, then stops for an unknown period, until the next batch starts. I saw new work from Rosetta coming in, and watched the tasks crashing quickly. It wasn't wasting vast amounts of time, but after seeing that I felt setting no new tasks was appropriate. Looking at the sent times, I see that three would come, fail and return. About 30 minutes later, another three would come, fail and return. After a number of these cycles, I set no new tasks. Some time later, I allowed downloads again to see what the status was, and nothing came. Conclusion: either they had all gone, or someone at the project had stopped them. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Someone put movingstubs on the server without beta testing it. It failed fast. One poster in another thread said Linux systems were running it OK, but for all of us Windows users the tasks failed immediately. Unfortunately, no one from the project pays attention to this, so every one of these tasks had to be downloaded and errored out by two people to get it out of the system. Such is life with this project now. |
adrianxw Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 137 |
>>> Someone put movingstubs on the server without beta testing it. A little worrying. >>> no one from the project pays attention to this. Would certainly seem to be the case. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
>>> Someone put movingstubs on the server without beta testing it. But you should be used to that by now. I think that for them, as long as the science gets done one way or another, how it gets done is not relevant any more. You've been on here a little longer than me, so you remember the days when they cared. |
adrianxw Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 137 |
>>> little longer than me Yes, you are a relative newbie!!!!! If you consider the machines we were running back then, to keep run times sensible, the amount of calculation per work unit must have been a fraction of what we turn over today. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
xii5ku Joined: 29 Nov 16 Posts: 22 Credit: 13,815,783 RAC: 419 |
adrianxw wrote: Rosetta USED to be a project with endless work units from their various work projects to download and crunch. Since late last year, that changed. Now, it is a project that produces a batch of work, lets that run, then stops for an unknown period, until the next batch starts. Rosetta@home has *always* had occasional pauses. They weren't as frequent as now though. (In the Rosetta 4 workqueue, that is; the rosetta python projects workqueue permanently has work.) adrianxw wrote: I saw new work from Rosetta coming in, and watched them crashing quickly. After seeing that, It wasn't wasting vast amounts of time, but felt no new tasks was appropriate. Looking at the sent times, I see that three would come, fail and return. About 30 minutes later, another three would come, fail and return. After a number of these cycles, I set no new tasks. Some time later, I opened downloads to see what the status was, and nothing came. Conclusion, they have all gone, or that someone at the project had stopped them. No. The admins are still ignoring this bug and are blissfully submitting more work for the failing application. The work was only all gone for a short while because all workunits were finished in one of the following two ways:
(besides other normal failure causes, e.g. timeouts) |
adrianxw Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 137 |
There has been other work recently. At this moment, I have 12 work units running on my machines that have all gone past the rapid failure point I was seeing. Two of these, for example, are past 9 hours now and still running. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
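For anyone following the backoff argument earlier in the thread, here is a rough sketch of how an "increasing backoff period" plays out after a run of failed scheduler contacts. It is a generic capped exponential backoff, not BOINC's actual client code: the function name `backoff_delays` and the `base`, `cap` and `jitter` values are made up for illustration, and the real client's intervals may well differ.

```python
import random


def backoff_delays(failures, base=60.0, cap=4 * 3600.0, jitter=0.0, rng=None):
    """Return the wait in seconds before each retry after consecutive failures.

    Hypothetical parameters (not taken from the BOINC client):
      base   -- wait after the first failure
      cap    -- upper limit on any single wait
      jitter -- fractional random spread, e.g. 0.1 for +/-10%
    """
    rng = rng or random.Random()
    delays = []
    for n in range(failures):
        # Double the wait with every consecutive failure, up to the cap.
        delay = min(cap, base * (2 ** n))
        if jitter:
            # Optional spread so many hosts do not retry at the same instant.
            delay *= 1 + rng.uniform(-jitter, jitter)
        delays.append(delay)
    return delays


if __name__ == "__main__":
    for i, d in enumerate(backoff_delays(10), start=1):
        print(f"after failure {i}: wait {d / 60:.0f} min")
```

With these assumed numbers the waits grow from one minute to the four-hour cap within nine failures, which is the sort of pattern that would look like "no new work for a couple of hours" from the host's side even if nothing changed at the project.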
©2024 University of Washington
https://www.bakerlab.org